Evidence for the rare decay $\Sigma^+ \to p \mu^+ \mu^-$

A search for the rare decay $\Sigma^+ \to p \mu^+ \mu^-$ is performed using $pp$ collision data recorded by the LHCb experiment at centre-of-mass energies $\sqrt{s} = 7$ and $8$ TeV, corresponding to an integrated luminosity of $3\,\mathrm{fb}^{-1}$. An excess of events is observed with respect to the background expectation, with a signal significance of 4.1 standard deviations. No significant structure is observed in the dimuon invariant mass distribution, in contrast with a previous result from the HyperCP experiment. The measured $\Sigma^+ \to p \mu^+ \mu^-$ branching fraction is $(2.2\,^{+\,1.8}_{-\,1.3})\times 10^{-8}$, where statistical and systematic uncertainties are included, which is consistent with the Standard Model prediction.

The $\Sigma^+ \to p \mu^+ \mu^-$ decay is an $s \to d$ quark-flavour-changing neutral-current process, allowed only at loop level in the Standard Model (SM). The process is dominated by long-distance contributions, for a predicted branching fraction of $1.6 \times 10^{-8} < \mathcal{B}(\Sigma^+ \to p \mu^+ \mu^-) < 9.0 \times 10^{-8}$ [1], while the short-distance SM contributions are suppressed and contribute to the branching fraction at the level of about $10^{-12}$. Evidence for this decay was reported by the HyperCP collaboration [2] with a measured branching fraction $\mathcal{B}(\Sigma^+ \to p \mu^+ \mu^-) = (8.6\,^{+\,6.6}_{-\,5.4} \pm 5.5) \times 10^{-8}$, which is compatible with the SM prediction. HyperCP observed three candidates; remarkably, all of them have almost the same dimuon invariant mass of $m_{X^0} = 214.3 \pm 0.5$ MeV/$c^2$, close to the lower kinematic limit. Such a distribution, if confirmed, would point towards a process with an intermediate particle $X^0$ coming from the $\Sigma^+$ baryon and decaying into two muons, i.e. a $\Sigma^+ \to p X^0 (\to \mu^+ \mu^-)$ decay, which would constitute evidence for physics beyond the SM (BSM). Various BSM theories have been proposed to explain the HyperCP result. The intermediate $X^0$ particle could be, for example, a light pseudoscalar Higgs boson [3,4] or a sgoldstino [5,6] in various supersymmetric models. Other interpretations and implications can be found in Refs. [7–13]; in general a pseudoscalar particle is favoured over a scalar particle, and a lifetime of the order of $10^{-14}$ s is estimated for the former case. Attempts to confirm the existence of this $X^0$ particle have been made at several experiments in various initial and final states without finding any signal [14–21]; these null results include studies of the decays $B^0_{(s)} \to \mu$ […], and a search for photon-like particles [25] by the LHCb experiment. However, the search for the $\Sigma^+ \to p \mu^+ \mu^-$ decay has not been repeated, owing to the lack of experiments with large hyperon production rates and to the experimental difficulty of reconstructing soft and long-lived hadrons.
Hyperons are produced copiously in high-energy proton-proton collisions at the Large Hadron Collider. A search for $\Sigma^+ \to p \mu^+ \mu^-$ decays at the LHCb experiment, as also suggested in Ref. [26], could therefore confirm or disprove the HyperCP evidence, and the branching fraction can be measured. This Letter presents a search for the $\Sigma^+ \to p \mu^+ \mu^-$ decay performed using $pp$ collision data recorded by the LHCb experiment at centre-of-mass energies $\sqrt{s} = 7$ and $8$ TeV, corresponding to an integrated luminosity of $3\,\mathrm{fb}^{-1}$. The inclusion of charge-conjugated processes is implied throughout this Letter.
This search follows a strategy similar to that of other studies of rare decays in LHCb, although with differences due to the relatively low transverse momenta of the final-state particles. First, a loose selection is applied based on geometric and kinematic variables. The final sample is then obtained by rejecting background with requirements on the output of a multivariate selection, based on a boosted decision tree (BDT) algorithm [27,28], and on particle identification variables. The signal yield is obtained from a fit to the $p \mu^+ \mu^-$ invariant-mass spectrum and is converted into a branching fraction by normalising to the $\Sigma^+ \to p \pi^0$ control channel. The analysis is designed to search for possible peaks in the dimuon invariant-mass distribution, in view of the possible existence of unknown intermediate particles.
The LHCb detector is a single-arm forward spectrometer covering the pseudorapidity range $2 < \eta < 5$, described in detail in Refs. [29,30]. It includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the $pp$ interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream of the magnet. Particle identification is provided by two ring-imaging Cherenkov detectors, an electromagnetic and a hadronic calorimeter, and a muon system composed of alternating layers of iron and multiwire proportional chambers.
The online event selection is performed by a trigger system, which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by two software stages. The first software stage performs a preliminary event reconstruction based on partial information, while the second applies a full event reconstruction. Each of the three trigger stages is divided into many trigger selections dedicated to various types of signal. The final-state particles from the signal decay involved in this analysis typically have insufficient transverse momenta to satisfy the requirements of one or more trigger stages. Nevertheless, given the large production rate of $\Sigma^+$ baryons in $pp$ collisions, the present search can be performed with data selected at one or more trigger stages by other particles in the event. In the offline processing, trigger decisions are associated with reconstructed candidates. A trigger decision can thus be ascribed to the reconstructed candidate, the rest of the event or a combination of both; events triggered as such are defined respectively as triggered on signal (TOS), triggered independently of signal (TIS), and triggered on both. While all the candidates passing the trigger selection are used in the search for $\Sigma^+ \to p \mu^+ \mu^-$ decays, only the TIS candidates are used in the normalisation channel $\Sigma^+ \to p \pi^0$. Furthermore, control channels with large yields are exploited to estimate the trigger efficiency by measuring the overlap of candidates which are TIS and TOS simultaneously [31].
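The overlap method mentioned above can be illustrated with a minimal sketch. It assumes the common form of the technique, in which the TOS efficiency is estimated as the fraction of TIS candidates that are also TOS, with a simple binomial uncertainty; the function name and the counts are hypothetical and are not taken from the analysis.

```python
import math


def tistos_efficiency(n_tis: int, n_tis_and_tos: int) -> tuple[float, float]:
    """Estimate the TOS efficiency from the overlap of TIS and TOS candidates.

    Assumption: efficiency = N(TIS and TOS) / N(TIS), with a binomial
    uncertainty. This is an illustrative form, not the analysis code.
    """
    if n_tis <= 0:
        raise ValueError("need at least one TIS candidate")
    if not 0 <= n_tis_and_tos <= n_tis:
        raise ValueError("overlap must lie between 0 and n_tis")
    eff = n_tis_and_tos / n_tis
    err = math.sqrt(eff * (1.0 - eff) / n_tis)
    return eff, err


# Made-up counts, for illustration only.
eff, err = tistos_efficiency(n_tis=2000, n_tis_and_tos=50)
print(f"TOS efficiency = {eff:.4f} +/- {err:.4f}")
```

A small overlap sample makes the binomial uncertainty large relative to the efficiency, which is the kind of limitation that a data-driven cross-check of this type faces.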
Simulation is used to devise and optimise the analysis strategy, as well as to estimate reconstruction and selection efficiencies. In the simulation, $pp$ collisions are generated using Pythia [32] with a specific LHCb configuration [33]. Decays of hadronic particles are described by EvtGen [34], in which final-state radiation is generated using Photos [35]. The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [36], as described in Ref. [37]. The signal $\Sigma^+ \to p \mu^+ \mu^-$ decay is generated according to a phase-space model.
Candidate $\Sigma^+ \to p \mu^+ \mu^-$ decays are selected by combining two good-quality oppositely charged tracks identified as muons with a third track identified as a proton. The three tracks are required to form a secondary vertex (SV) with a good vertex-fit quality. The short lifetime estimated for the $X^0$ particle would result in a prompt signal in this search, hence no attempt is made to distinguish the dimuon origin vertex from the SV of the $\Sigma^+$ baryon. The measured $\Sigma^+$ candidate proper decay time is required to be greater than 6 ps, ensuring that the SV is displaced from any $pp$ interaction vertex (primary vertex, PV). The final-state particles are required to be inconsistent with originating from any PV in the event. Only $\Sigma^+$ candidates with transverse momentum $p_T > 0.5$ GeV/$c$ and a decay topology consistent with a particle originating from the PV are retained. A candidate […], where $m_{\Sigma^+}$ is the known mass of the $\Sigma^+$ particle [38]. The background component due to $\Lambda \to p \pi^-$ decays is vetoed by discarding candidates having a $p \mu^-$ pair invariant mass, calculated with the $p \pi^-$ mass hypothesis, within 10 MeV/$c^2$ of the known $\Lambda$ mass [38]. Possible backgrounds from decays peaking in the $p \mu^+ \mu^-$ invariant mass have been examined, including $K^+ \to \pi^+ \pi^- \pi^+$, $K^+ \to \pi^+ \mu^- \mu^+$, and various hyperon decays, and none has been found to contribute significantly. After all selection requirements, no retained event contains more than one candidate.
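As an illustration of the $\Lambda$ veto, the sketch below recomputes the invariant mass of the proton and the negatively charged track under the $p\pi^-$ hypothesis and rejects candidates within 10 MeV/$c^2$ of the $\Lambda$ mass. The function names and the representation of momenta as plain three-component tuples in MeV/$c$ are assumptions made for this example; the mass values are rounded world averages.

```python
import math

# Rounded known masses in MeV/c^2.
M_PROTON = 938.272
M_PION = 139.570  # charged pion
M_LAMBDA = 1115.683


def invariant_mass(p1, m1, p2, m2):
    """Invariant mass of two particles given 3-momenta (MeV/c) and mass hypotheses."""
    e1 = math.sqrt(m1 ** 2 + sum(c * c for c in p1))
    e2 = math.sqrt(m2 ** 2 + sum(c * c for c in p2))
    px, py, pz = (a + b for a, b in zip(p1, p2))
    m_squared = (e1 + e2) ** 2 - (px * px + py * py + pz * pz)
    return math.sqrt(max(m_squared, 0.0))


def passes_lambda_veto(p_proton, p_negative_track, window=10.0):
    """Return True if the candidate survives the veto.

    The negative track is assigned the pion mass, whatever its
    particle-identification response.
    """
    mass = invariant_mass(p_proton, M_PROTON, p_negative_track, M_PION)
    return abs(mass - M_LAMBDA) >= window


# Back-to-back pair with the momentum expected in a Lambda decay at rest: vetoed.
print(passes_lambda_veto((0.0, 0.0, 100.58), (0.0, 0.0, -100.58)))
```

Assigning the pion mass to the muon candidate is what makes a misidentified $\Lambda \to p\pi^-$ decay reconstruct at the $\Lambda$ mass, so the veto acts on exactly the hypothesis under which this background peaks.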
Candidate $\Sigma^+ \to p \pi^0$ decays are selected by combining one good-quality track identified as a proton with a $\pi^0$ reconstructed in the $\pi^0 \to \gamma\gamma$ mode from two clusters in the electromagnetic calorimeter. Since the $\Sigma^+$ decay SV cannot be reconstructed from the proton track alone, the momentum direction of the $\pi^0$ is calculated assuming that the $\pi^0$ is produced at the PV. The selection of $\Sigma^+ \to p \pi^0$ decays is similar to that of the signal, with tighter requirements applied, in order to reduce the large combinatorial background, on the proton identification and on the transverse momenta of the final-state particles ($p_T > 0.5$ GeV/$c$ for the proton and $p_T > 0.7$ GeV/$c$ for the $\pi^0$). Finally, candidate $K^+ \to \pi^+ \pi^- \pi^+$ decays, selected as a control channel for various parts of the analysis, are required to pass a selection similar to that of the signal, starting from three good-quality tracks with total charge equal to $\pm 1$, which are assigned the pion mass hypothesis without requirements on the identification of the particle.
The sample of $\Sigma^+ \to p \mu^+ \mu^-$ candidates in data after the initial selection is dominated by combinatorial background, part of which is due to misidentified particles. This background is rejected by placing requirements on the BDT output variable and on multivariate particle identification variables [30] on the muons and on the proton. The BDT combines information from the following input variables: the angle between the $\Sigma^+$ reconstructed momentum and the vector joining the PV to the SV, the flight distance significance of the $\Sigma^+$ candidate, the distance of closest approach among the final-state particles, the transverse momenta of the final-state particles, the impact parameter $\chi^2$ ($\chi^2_{\mathrm{IP}}$) of the final-state particles, defined as the difference between the vertex-fit $\chi^2$ of a PV formed with and without the particle in question, the $\chi^2_{\mathrm{IP}}$ of the $\Sigma^+$ candidate, the $\chi^2$ of the SV, and an isolation variable constructed from the number of tracks within an angular cone around each of the final-state particles. These variables are chosen so that the dependence on the $p \mu^+ \mu^-$ invariant mass and on the dimuon invariant mass is small and linear, to minimise potential biases. The BDT is optimised using simulated samples of $\Sigma^+ \to p \mu^+ \mu^-$ events for the signal and $p \mu^+ \mu^+$ candidates in data for the background. The selection for the control $p \mu^+ \mu^+$ sample is identical to that of the signal but considering muons of identical charge. The final selection criteria are chosen in order to optimise the potential to obtain evidence for a signal with a branching fraction as small as possible [39]. No BDT selection is applied to the normalisation and control channels.
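The first BDT input listed above, the angle between the reconstructed $\Sigma^+$ momentum and the PV-to-SV vector, can be computed as in the following sketch. The function name and the use of plain coordinate tuples are assumptions made for illustration; the other input variables require vertex-fit information and are not reproduced here.

```python
import math


def pointing_angle(pv, sv, momentum):
    """Angle (radians) between the candidate momentum and the PV-to-SV flight vector."""
    flight = [s - p for s, p in zip(sv, pv)]
    dot = sum(f * m for f, m in zip(flight, momentum))
    norm = math.sqrt(sum(f * f for f in flight)) * math.sqrt(sum(m * m for m in momentum))
    if norm == 0.0:
        raise ValueError("flight vector and momentum must be non-zero")
    # Clamp to protect acos against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.acos(cos_angle)


# A candidate whose momentum points exactly along its flight direction.
print(pointing_angle(pv=(0.0, 0.0, 0.0), sv=(0.1, 0.0, 50.0), momentum=(20.0, 0.0, 10000.0)))
```

A genuine decay of a particle produced at the PV gives an angle close to zero, whereas random track combinations tend to give larger values.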
The number of signal candidates is converted into a branching fraction with the formula
$$\mathcal{B}(\Sigma^+ \to p \mu^+ \mu^-) = \frac{\varepsilon_{\Sigma^+ \to p \pi^0}}{\varepsilon_{\Sigma^+ \to p \mu^+ \mu^-}} \, \frac{N_{\Sigma^+ \to p \mu^+ \mu^-}}{N_{\Sigma^+ \to p \pi^0}} \, \mathcal{B}(\Sigma^+ \to p \pi^0) = \alpha \, N_{\Sigma^+ \to p \mu^+ \mu^-},$$
where $\varepsilon$, $N$ and $\mathcal{B}$ are the efficiency, candidate yield and branching fraction of the corresponding channel, respectively, and $\alpha$ is the single-event sensitivity. The ratio of signal and normalisation channel efficiencies, which includes the acceptance, the trigger efficiency, the reconstruction efficiency of the final-state particles and the selection efficiency, is computed with samples of simulated events corrected to take into account known differences between data and simulation. The reconstruction efficiency for the $\pi^0$ is calibrated using the ratio of $B^+ \to J/\psi K^{*+} (\to K^+ \pi^0)$ and $B^+ \to J/\psi K^+$ decays reconstructed in data [40]. The particle-identification efficiencies of protons and muons are calibrated with control channels in data. Residual differences between data and simulation are treated as sources of systematic uncertainty. The ratio of the trigger efficiencies for the signal and normalisation channels is estimated with simulated samples and cross-checked in data: the trigger efficiency is obtained for selected trigger lines from the overlap of TIS and TOS events in the normalisation channel and is compared between data and simulation [31]. The small size of this overlap induces a 40% relative systematic uncertainty associated with the trigger efficiency ratio. The ratio of the trigger efficiencies is of the order of 0.09, owing to the use of all events for the signal, while TIS-only events are used for the normalisation channel. Possible differences in the BDT selection efficiency for the $\Sigma^+ \to p \mu^+ \mu^-$ signal in data and in simulation are calibrated using the $K^+ \to \pi^+ \pi^- \pi^+$ control channel. The sources of systematic uncertainties associated with the normalisation are reported in Table 1.
The observed number of $\Sigma^+ \to p \pi^0$ candidates is $(1171 \pm 9) \times 10^{3}$, as obtained from a binned extended maximum likelihood fit to the corrected invariant mass distribution $m^{\mathrm{corr}}_{\Sigma}$. The corrected invariant mass is defined as $m^{\mathrm{corr}}_{\Sigma} = m_{p\gamma\gamma} - m_{\gamma\gamma} + m_{\pi^0}$, where $m_{\pi^0}$ is the known mass of the $\pi^0$ meson [38], to account for the limited precision in the reconstructed invariant mass of the two photons ($m_{\gamma\gamma}$). The $\Sigma^+ \to p \pi^0$ distribution is described as a Gaussian function with a power-law tail on the higher-mass side, while the background is described by a modified ARGUS function [41], where the power parameter is allowed to vary as in Ref. [42]. The distribution is shown in Fig. 1, superimposed with the fit.
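The corrected-mass definition can be written as a one-line function. The sketch below is only an illustration of the formula in the text; the function name and the numerical inputs are hypothetical, and the $\pi^0$ mass is a rounded world average.

```python
M_PI0 = 134.977  # rounded known pi0 mass in MeV/c^2


def corrected_mass(m_p_gamma_gamma: float, m_gamma_gamma: float, m_pi0: float = M_PI0) -> float:
    """Corrected Sigma+ mass: the measured diphoton mass is replaced by the known pi0 mass."""
    return m_p_gamma_gamma - m_gamma_gamma + m_pi0


# Illustrative values: a diphoton mass reconstructed 10 MeV/c^2 too high
# shifts the three-body mass by a similar amount, and the correction removes the shift.
print(corrected_mass(m_p_gamma_gamma=1200.0, m_gamma_gamma=145.0))
```

Because the mismeasurement of the photon energies moves $m_{p\gamma\gamma}$ and $m_{\gamma\gamma}$ in a correlated way, subtracting one from the other cancels most of the calorimeter resolution.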
The single-event sensitivity is $\alpha = (2.2 \pm 1.2) \times 10^{-9}$, where the uncertainty is dominated by the systematic contribution. This sensitivity corresponds to about $10^{14}$ $\Sigma^+$ baryons produced in the LHCb acceptance in the considered dataset. The number of expected signal $\Sigma^+ \to p \mu^+ \mu^-$ candidates is $23 \pm 20$ assuming a branching fraction of $(5 \pm 4) \times 10^{-8}$, to cover the range predicted by the SM.
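The role of the single-event sensitivity can be checked with simple arithmetic, assuming the usual relation in which the branching fraction equals $\alpha$ times the signal yield. The sketch below uses only central values quoted in this Letter and makes no attempt to propagate uncertainties; the function names are hypothetical.

```python
ALPHA = 2.2e-9  # central value of the single-event sensitivity quoted in the text


def branching_fraction(n_signal: float, alpha: float = ALPHA) -> float:
    """Branching fraction corresponding to a signal yield, B = alpha * N."""
    return alpha * n_signal


def expected_yield(branching: float, alpha: float = ALPHA) -> float:
    """Signal yield expected for an assumed branching fraction, N = B / alpha."""
    return branching / alpha


# Expected number of candidates for the assumed SM-like branching fraction of 5e-8.
print(round(expected_yield(5e-8)))
# Branching fraction corresponding to the fitted yield of 10.2 candidates.
print(branching_fraction(10.2))
```

The two printed values correspond to the central values of the expected yield and of the measured branching fraction quoted in the text.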
The observed number of signal $\Sigma^+ \to p \mu^+ \mu^-$ decays is obtained with a fit to the $p \mu^+ \mu^-$ invariant-mass distribution in the range $1149.6 < m_{p\mu^+\mu^-} < 1409.6$ MeV/$c^2$. The signal distribution is described by a Hypatia function [43]. The peak position and resolution are calibrated using the control channel $K^+ \to \pi^+ \pi^- \pi^+$ and by comparing distributions in data and simulation. No bias is seen in the peak position, while a relative positive correction of 25% with respect to the simulation is applied to the resolution. A resolution of $4.28 \pm 0.19$ MeV/$c^2$ is obtained for the signal $\Sigma^+ \to p \mu^+ \mu^-$ distribution and is used in the fit to define a Gaussian constraint on the width of the signal distribution. The combinatorial background is described as a modified ARGUS function with all parameters left free with the exception of the threshold, which is fixed to the kinematic limit. The shape of this background is also cross-checked with that of $p \mu^+ \mu^+$ candidates in data.
The invariant mass distribution of the $\Sigma^+ \to p \mu^+ \mu^-$ candidates in data is shown in Fig. 2. The significance of the signal is $4.1\,\sigma$, obtained from a comparison of the likelihood value of the nominal fit with that of a background-only fit [44], with the relevant systematic uncertainties included as Gaussian constraints to the likelihood. A signal yield of $10.2\,^{+\,3.9}_{-\,3.5}$ is observed. The corresponding branching fraction is $\mathcal{B}(\Sigma^+ \to p \mu^+ \mu^-) = (2.2\,^{+\,0.9}_{-\,0.8}\,^{+\,1.5}_{-\,1.1}) \times 10^{-8}$, where the first uncertainty is statistical and the second is systematic, consistent with the SM prediction. As a cross-check, the fit is repeated with tighter or looser requirements on the BDT or on the particle identification variables, and the signal yield is found to vary consistently with the signal efficiency. The fit is also repeated assuming a linear function for the background, in place of an ARGUS function, and the signal yield and significance are found to be stable. Candidates in data are composed of about 48% $\Sigma^+$ anti-baryons in the final sample.
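A significance obtained by comparing two likelihood values is commonly computed in the asymptotic approximation, where the square root of twice the difference in negative log-likelihood is read as a number of standard deviations. The sketch below assumes that approximation with one degree of freedom; whether the analysis used exactly this prescription is not stated here, and the negative log-likelihood values in the example are invented to give 4.1.

```python
import math


def significance_from_nll(nll_background_only: float, nll_signal_plus_background: float) -> float:
    """Asymptotic significance Z = sqrt(2 * delta NLL) for one extra signal parameter."""
    delta = nll_background_only - nll_signal_plus_background
    if delta < 0.0:
        raise ValueError("the nominal fit cannot be worse than the background-only fit")
    return math.sqrt(2.0 * delta)


def one_sided_p_value(z: float) -> float:
    """One-sided Gaussian tail probability corresponding to a significance z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Invented likelihood values chosen so that Z = 4.1.
z = significance_from_nll(nll_background_only=108.405, nll_signal_plus_background=100.0)
print(z, one_sided_p_value(z))
```

In this approximation a significance of 4.1 standard deviations corresponds to a one-sided background-only probability of roughly two in $10^{5}$.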
The distribution of the dimuon invariant mass after background subtraction, performed with the sPlot method [45], is shown in Fig. 3. A scan for a possible resonant structure in the dimuon invariant mass is performed, considering a region within two times the resolution in the $p \mu^+ \mu^-$ invariant mass around the known $\Sigma^+$ mass. The distribution of these candidates as a function of the dimuon invariant mass is shown in the supplemental material to this Letter [46]. Steps of half the resolution on the dimuon invariant mass, $\sigma(m_{\mu^+\mu^-})$, are considered in this scan, following the method outlined in Ref. [47]. The value of $\sigma(m_{\mu^+\mu^-})$ varies in the range $[0.3, 2.3]$ MeV/$c^2$ depending on the dimuon invariant mass, as shown in Ref. [46]. For each step the putative signal is estimated in a window of $\pm 1.5 \times \sigma(m_{\mu^+\mu^-})$ around the considered particle mass, while the background is estimated from the lower and upper sidebands, located between 1.5 and 4.0 times $\sigma(m_{\mu^+\mu^-})$ from the same mass. Only one of the two sidebands is considered when the second is outside the allowed kinematic range. The local p-value of the background-only hypothesis as a function of the dimuon mass is shown in Ref. [46], and no significant signal is found. The fit to the $p \mu^+ \mu^-$ invariant mass is then repeated restricting the sample to events within 1.5 times the resolution from the putative particle ($m_{\mu^+\mu^-} \in [214.3 \pm 0.75]$ MeV/$c^2$). No significant signal is found and a yield of $3.0\,^{+\,1.7}_{-\,1.4}$ is measured, corresponding to 30% of the $\Sigma^+ \to p \mu^+ \mu^-$ yield. An upper limit on the branching fraction of the resonant channel is thus set with the CL$_S$ method [48].
In summary, a search for the $\Sigma^+ \to p \mu^+ \mu^-$ rare decay is performed by the LHCb experiment using $pp$ collisions at centre-of-mass energies $\sqrt{s} = 7$ and $8$ TeV, corresponding to an integrated luminosity of $3\,\mathrm{fb}^{-1}$. Evidence for the $\Sigma^+ \to p \mu^+ \mu^-$ decay is found with a significance of 4.1 standard deviations, including systematic uncertainties. A branching fraction $\mathcal{B}(\Sigma^+ \to p \mu^+ \mu^-) = (2.2\,^{+\,1.8}_{-\,1.3}) \times 10^{-8}$ is measured, consistent with the SM prediction. No significant peak consistent with an intermediate particle is found in the dimuon invariant-mass distribution of the signal candidates.
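The window and sideband layout used in the dimuon-mass scan described above can be sketched as follows. The sketch makes two assumptions that are not stated in the text: a sideband is dropped when it is not fully contained in the kinematic range, and the background under the signal window is extrapolated from the sidebands assuming a flat distribution. Function names and the example masses are hypothetical; the kinematic limits are computed from rounded known masses.

```python
M_MUON = 105.658  # MeV/c^2
M_SIGMA = 1189.37  # MeV/c^2
M_PROTON = 938.272  # MeV/c^2
M_MIN = 2.0 * M_MUON  # lower kinematic limit of the dimuon mass
M_MAX = M_SIGMA - M_PROTON  # upper kinematic limit of the dimuon mass


def scan_windows(mass, sigma, m_min=M_MIN, m_max=M_MAX):
    """Return the signal window and the usable sidebands for one scan step."""
    signal = (mass - 1.5 * sigma, mass + 1.5 * sigma)
    lower = (mass - 4.0 * sigma, mass - 1.5 * sigma)
    upper = (mass + 1.5 * sigma, mass + 4.0 * sigma)
    # Assumption: keep a sideband only if it lies entirely inside the kinematic range.
    sidebands = [sb for sb in (lower, upper) if sb[0] >= m_min and sb[1] <= m_max]
    return signal, sidebands


def count_in(masses, window):
    """Number of candidates with dimuon mass inside the half-open window."""
    low, high = window
    return sum(1 for m in masses if low <= m < high)


def expected_background(masses, mass, sigma, m_min=M_MIN, m_max=M_MAX):
    """Sideband count scaled to the signal-window width (flat-background assumption)."""
    signal, sidebands = scan_windows(mass, sigma, m_min, m_max)
    if not sidebands:
        raise ValueError("no sideband inside the kinematic range")
    n_sidebands = sum(count_in(masses, sb) for sb in sidebands)
    sideband_width = sum(high - low for low, high in sidebands)
    return n_sidebands * (signal[1] - signal[0]) / sideband_width


# Near the lower kinematic limit only the upper sideband survives.
print(scan_windows(mass=212.0, sigma=0.5))
```

Comparing the count in the signal window with this extrapolated background at each step gives the input for a local p-value; the statistical treatment itself follows the cited method and is not reproduced here.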

Figure 1: Distribution of the corrected mass $m^{\mathrm{corr}}_{\Sigma}$, defined as in the text, for $\Sigma^+ \to p \pi^0$ candidates superimposed with the fit to data.

Figure 3: Background-subtracted distribution of the dimuon invariant mass for $\Sigma^+ \to p \mu^+ \mu^-$ candidates, superimposed with the distribution from the simulated phase-space (PS) model. Uncertainties on data points are calculated as the square root of the sum of squared weights.

Table 1: Relative systematic uncertainties associated with the normalisation.
a Universidade Federal do Triângulo Mineiro (UFTM), Uberaba-MG, Brazil
b Laboratoire Leprince-Ringuet, Palaiseau, France
c P.N. Lebedev Physical Institute, Russian Academy of Science (LPI RAS), Moscow, Russia
d Università di Bari, Bari, Italy
e Università di Bologna, Bologna, Italy
f Università di Cagliari, Cagliari, Italy
g Università di Ferrara, Ferrara, Italy
h Università di Genova, Genova, Italy
i Università di Milano Bicocca, Milano, Italy
j Università di Roma Tor Vergata, Roma, Italy
k Università di Roma La Sapienza, Roma, Italy
l AGH - University of Science and Technology, Faculty of Computer Science, Electronics and Telecommunications, Kraków, Poland
m LIFAELS, La Salle, Universitat Ramon Llull, Barcelona, Spain
n Hanoi University of Science, Hanoi, Vietnam
o Università di Padova, Padova, Italy
p Università di Pisa, Pisa, Italy
q Università degli Studi di Milano, Milano, Italy
r Università di Urbino, Urbino, Italy
s Università della Basilicata, Potenza, Italy
t Scuola Normale Superiore, Pisa, Italy
u Università di Modena e Reggio Emilia, Modena, Italy
v Iligan Institute of Technology (IIT), Iligan, Philippines
w Novosibirsk State University, Novosibirsk, Russia
x National Research University Higher School of Economics, Moscow, Russia
† Deceased