Search for the lepton number violating decay $\Sigma^{-} \to p e^{-} e^{-}$ and the rare inclusive decay $\Sigma^{-} \to \Sigma^{+} X$

Using a data sample of $(1310.6 \pm 7.0) \times 10^{6}$ $J/\psi$ events taken with the BESIII detector at the center-of-mass energy of 3.097 GeV, we search for the first time for the lepton number violating decay $\Sigma^{-} \to p e^{-} e^{-}$ and the rare inclusive decay $\Sigma^{-} \to \Sigma^{+} X$, where $X$ denotes any possible particle combination. The $\Sigma^-$ candidates are tagged in $J/\psi \to \bar{\Sigma}(1385)^+\Sigma^-$ decays. No signal candidates are found, and the upper limits on the branching fractions at the 90\% confidence level are determined to be $\mathcal{B}(\Sigma^{-} \to p e^{-} e^{-})<6.7 \times 10^{-5}$ and $\mathcal{B}(\Sigma^{-} \to \Sigma^{+} X)<1.2 \times 10^{-4}$.

Currently, we cannot distinguish whether neutrinos are Dirac fermions or Majorana fermions. Hence, it is important to investigate the validity of lepton-number conservation directly. Observation of lepton number violating (LNV) processes would explicitly point out the direction of new physics, while experimental upper limits (ULs) could translate into stringent conditions for theoretical models.
A number of experiments have searched for LNV in meson decays [19], while only a few experiments have reported searches in hyperon decays [20,21]. The LNV decay of B − 1 → B + 2 l − l − (B = baryon; l = e, µ) is a unique process, in which two down-type (d or s) quarks convert into two up-quarks changing the charge of the hyperons according to the ∆Q = ∆L = 2 rule, where ∆Q and ∆L are the changes of charge number and lepton number, respectively. The transition of the quarks is assumed to occur at the same space-time location, as shown in Fig. 1, and is determined by local four-quark operators [22][23][24]. The underlying mechanism is similar to that of neutrinoless double beta (0νββ) nuclear decay (A, Z) → (A, Z + 2)e − e − , which is a sensitive probe in the search for the effects of very light Majorana neutrinos [25,26]. In Refs. [22,23], based on a model where the dominant contributions are given by a loop of a virtual baryon and a Majorana neutrino, as shown in Fig. 2, the predicted branching fractions of Σ − → pe − e − and Σ − → Σ + e − e − , can reach 10 −31 and 10 −35 , respectively. While in Ref. [24], based on the Massachusetts Institute of Technology (MIT) bag model [27,28], the branching fractions are increased by several orders of magnitude, and, for example, the branching fraction of Σ − → pe − e − can reach 10 −23 .
, in which two down type quarks convert into two up type quarks and two leptons.
, where a loop of a virtual baryon, B0, and a Majorana neutrino, ν, is introduced.
In this paper, using the process J/ψ →Σ(1385) + Σ − [19] from the data sample of (1310.6 ± 7.0) × 10 6 J/ψ events [29][30][31] collected with the BESIII detector, we present the first search for the ∆Q = ∆L = 2 process in Σ − decays. In the channel Σ − → Σ + e − e − , due to the limited phase space (M Σ − − M Σ + ≃ 8 MeV/c 2 ) and the small momentum of the Σ − , the leptons have very small momenta and cannot be reconstructed in the detector. Therefore, the processes investigated in this analysis are Σ − → pe − e − and the rare inclusive decay Σ − → Σ + X, where X represents any particles or particle combinations, including e − e − . Throughout this paper, the charge conjugate channelsΣ + →pe + e + andΣ + →Σ −X are investigated at the same time. It is important to note that the blind analysis method is used in this paper. The Monte Carlo sample is used to determine the analysis strategy. Then, with fixed strategy, the data sample is opened to obtain the final results.

II. BESIII DETECTOR AND MONTE CARLO SIMULATION
The BESIII detector is a magnetic spectrometer [32] located at the Beijing Electron Positron Collider (BEPCII) [33]. The cylindrical core of the BESIII detector consists of a helium-based multilayer drift chamber (MDC), a plastic scintillator time-of-flight system (TOF), and a CsI(Tl) electromagnetic calorimeter (EMC), which are all enclosed in a superconducting solenoidal magnet providing a 1.0 T (0.9 T in 2012) magnetic field. The solenoid is supported by an octagonal flux-return yoke with resistive plate counter muon identifier modules interleaved with steel. The acceptance of charged particles and photons is 93% over 4π solid angle. The charged-particle momentum resolution at 1 GeV/c is 0.5%, and the dE/dx resolution is 6% for the electrons from Bhabha scattering. The EMC measures photon energies with a resolution of 2.5% (5%) at 1 GeV in the barrel (end cap) region. The time resolution of the TOF barrel part is 68 ps, while that of the end cap part is 110 ps.
Simulated samples produced with the geant4based [34] Monte Carlo (MC) software, which includes the geometric description of the BESIII detector and the detector response, are used to determine the detection efficiency and to estimate the backgrounds. The simulation includes the beam energy spread and initial state radiation (ISR) in the e + e − annihilations modeled with the generator kkmc [35,36].
The inclusive MC sample consists of the production of the J/ψ resonance, and the continuum processes (e + e − → qq) incorporated in kkmc [35,36]. The known decay modes of J/ψ are modeled with evtgen [37,38] using branching fractions taken from the Particle Data Group [19], and the remaining unknown decays from the charmonium states with lundcharm [39,40]. The final state radiation (FSR) from charged final state particles are incorporated with the photos [41] package.

III. EVENT SELECTION
In this paper, the Σ − data sample is obtained through the process J/ψ →Σ(1385) + Σ − . A double-tag method, which was developed by the MARK-III experiment [42], is employed to determine the absolute branching fraction and reduce the systematic uncertainties. First, we reconstructΣ(1385) + via the decayΣ(1385) + →Λπ + and then determine the number of Σ − events in the recoil mass spectrum of theΣ(1385) + , which is defined in Eq. (2). These events are referred to as 'single tag' (ST) events. Next, we search for signal candidates in the selected Σ − sample by looking directly for their decay products. Events with signal candidates are called 'double tag' (DT) events. The absolute branching fraction is calculated by where N obs ST is the ST yield, N obs DT is the DT yield, ǫ ST and ǫ DT are the ST and the DT efficiencies.

A. ST event selection
In the selection of ST events,Σ(1385) + is reconstructed via theΣ(1385) + →Λπ + decay. All charged tracks are required to have a polar angle within | cos θ| < 0.93. TheΛ is reconstructed viaΛ →pπ + decay. Each track used to reconstructΛ is required to have a distance of closest approach to the interaction point (IP) along the beam direction less than 20 cm, while the bachelor pion candidates are required to have a distance of closest approach to the IP less than 1 cm in the plane perpendicular to the beam and less than 10 cm along the beam direction. The values of 20 cm, 1 cm and 10 cm are based on track performance study.
We perform particle identification (PID) on the charged tracks with the information of dE/dx measured in the MDC and the time of flight measured by the TOF. The confidence levels (CLs) for the pion, kaon, and proton hypotheses (CL π , CL K , and CL p ) are calculated. The anti-proton candidates are required to satisfy CL p > 0.001, CL p > CL π , and CL p > CL K . The bachelor pion candidates are required to satisfy CL π > 0.001 and CL π > CL K , while there is no PID requirement for the pion fromΛ decay.
The two charged tracks used to reconstructΛ are constrained to originate from a common decay vertex by performing a primary vertex fit on the two tracks. The χ 2 1 , which represents the goodness of the primary vertex fit, is required to be less than 100. A secondary vertex fit is also performed on the same daughter tracks ofΛ candidates, imposing the additional constraint that the momentum of the candidate points back to the IP. The χ 2 2 of the secondary vertex fit is required to be less than 100.
The two values of χ 2 requirements result in high quality vertex fits. To further suppress non-Λ background, the decay length ofΛ, which is the distance between the IP and the secondary vertex, is required to be larger than 2 standard deviations of the decay length. The fitted fourmomentum of thepπ + combination is used in further analysis, and the invariant mass of thepπ + combination is required to be within (1.112, 1.120) GeV/c 2 .
The recoil mass ofΣ(1385) + is defined as where, E J/ψ p J/ψ , EΛ( pΛ) and E π + ( p π + ) are the energies (momenta) of J/ψ,Λ and π + in the J/ψ's center-ofmass frame. To suppress backgrounds, such as J/ψ → Σ(1385) + Σ(1385) − , the sum of EΛ and E π + is required to be within (1.59, 1.70) GeV. AllΣ(1385) + candidates in an event are retained. We then fit the M recoil distribution to obtain the ST yield. Figure 3 shows the fit to the M recoil distribution of data. In the fit, the background is described by a second order Chebychev polynomial function, and the signal shape is modeled by MC simulated shape convolved with a Gaussian function to account for the resolution difference between data and MC simulation.

B. DT event selection
In the recoil side of the selected ST events, we search for the LNV process Σ − → pe − e − and the rare inclusive decay Σ − → Σ + X, using the charged tracks and electromagnetic showers not used previously. Each charged track is also required to have a polar angle within | cos θ| < 0.93 and a distance of closest approach to the interaction IP along the beam direction less than 20 cm. The momentum of the Σ − is small and the phase space in Σ − → Σ + X is extremely small like that in Σ − → Σ + e − e − due to the small difference in the Σ ± masses. Therefore the momenta of particles in the Σ − → Σ + X decay, except for Σ + reconstructed via Σ + → pπ 0 , are too small to reach the MDC and other detectors. So only three charged tracks are required for Σ − → pe − e − and one charged track for Σ − → Σ + X.
Proton PID is performed as above. Electron PID is performed using the dE/dx, TOF, and EMC informa-tion, with which the CLs for electron, pion and kaon hypotheses (CL e , CL π and CL K ) are calculated. Electron candidates are required to satisfy CL e > 0.001 and CL e /(CL e + CL π + CL K ) > 0.8.
Electromagnetic showers are reconstructed from clusters of energy deposited in the EMC. The photon candidate showers must have a minimum energy of 25 MeV in the barrel region (| cos θ| < 0.80) or 50 MeV in the end cap regions (0.86 < | cos θ| < 0.92). To suppress electronic noise and energy deposits unrelated to the event, timing information from the EMC for the photon candidates must be in coincidence with collision events, with a requirement of 0 ≤ t ≤ 700 ns. The π 0 candidates are reconstructed from pairs of photon candidates. Due to the worse resolution in the end cap regions of the EMC, π 0 candidates reconstructed with both photons in the end caps of the EMC are rejected. The invariant mass of two photons is required to be within (0.115, 0.150) GeV/c 2 for π 0 candidates. To improve the overall kinematic resolution, a mass-constraint kinematic fit is performed by constraining the γγ invariant mass to the nominal π 0 mass [19]. When multiple π 0 candidates are reconstructed, we retain the one with the smallest χ 2 of the massconstraint kinematic fit.
Furthermore, we require the pe − e − invariant mass to be within (1.169, 1.209) GeV/c 2 and the pπ 0 invariant mass to be within (1.167, 1.201) GeV/c 2 . For the Σ − → Σ + X channel, to further suppress background, one additional kinematic variable is defined as where p are the corresponding momenta in the J/ψ's center-of-mass system, and | p miss | is required to be less than 0.1 GeV/c. Since the X particles are not detected, the DT efficiency of Σ − → Σ + X is only affected by the reconstruction of the p and π 0 from the Σ + decays. Simulation studies show that due to the limited phase space and small Σ − momentum, the momenta and the angular distributions of p and π 0 are almost the same when X represents different final states. Therefore, we use the MC samples of Σ − → Σ + e − e − to estimate the efficiency of Σ − → Σ + X.
To determine the DT yield, we search for candidates in the M recoil distributions for Σ − → pe − e − and Σ − → Σ + X in data, shown in Fig. 4. The signal region is defined as [1.165, 1.220] GeV/c 2 , which covers more than 99.7% of all signal events. The DT efficiencies obtained from MC simulation are 9.02% and 11.08%, respectively. No event is observed in the signal region for either channel.

C. Background Study
Potential background candidates come from the continuum process and from other J/ψ decays. To estimate the first kind, we study the continuum process with data samples collected at √ s = 3.08, 3.65, 3.773 GeV, where We use the inclusive MC sample to estimate backgrounds from J/ψ decays. The ST peaking background component is from J/ψ → Σ(1385) 0Λ +c.c., and the number of events scaled to data accounts for about 0.07% of the total ST yield, so it is ignored. The smooth background components can be described by a second order Chebychev polynomial. Figure 5 shows the different components from the inclusive MC sample. For the DT selection, only 4 and 3 background events survive in the signal regions of the Σ − → pe − e − and Σ − → Σ + X channels, respectively, corresponding to normalized numbers of 0.3 and 0.6 background events.

IV. SYSTEMATIC UNCERTAINTIES
The systematic uncertainties in the measurements, summarized in Table I, mainly originate from differences between data and simulation in the tracking and PID efficiency, the tag bias, the MC model and the cited branching fractions.
The systematic uncertainty due to the proton tracking efficiency is determined to be 1.0% for each track by studying the two control samples of J/ψ → pK −Λ + c.c. and J/ψ → ΛΛ [43]. The uncertainty arising from the proton PID efficiency is determined with the control sample J/ψ → ppπ + π − . We bin events in the sample in cos θ (i) and | p| (j) of the proton [44] and add differences between data and MC samples together with the following formula where ∆ǫ P ID ij is the difference of PID efficiency and ω P ID ij is the weight factor. The weight factor is defined as the ratio of the number of events in the ij bin to the total number of events of the sample. We use the total differences as the uncertainties in Σ − → pe − e − and Σ − → Σ + X, which are 0.4% and 0.3%, respectively. The uncertainties due to the tracking and PID efficiencies of the electron are studied with the control sample e + e − → γe + e − (at √ s = 3.097 GeV). Similar as above, we bin events in the sample by cos θ and | p t | (| p|) for tracking (PID). The uncertainties of the tracking efficiency for the high-momentum and low-momentum electrons are 0.5% and 3.1%, respectively, and the uncertainties of the PID efficiency are 0.6% and 2.2%, respectively. The total differences for tracking and PID for Σ − → pe − e − are 3.6% and 2.8%, respectively. The uncertainties associated with photon detection and π 0 reconstruction are obtained from the control sample J/ψ → π + π − π 0 [45]. The differences between data and MC samples are 1.0% per photon and 1.0% per π 0 , respectively. The systematic uncertainty due to the Σ + (pπ 0 ) mass window is determined to be 0.6% using a control sample of J/ψ → Σ +Σ− decays [46]. Since we do not have a sample of Σ − → pe − e − , we change the limits to those obtained from the mass distribution of Σ − whose decay is determined by a phase space model. The relative change of the DT efficiency, 0.5%, is taken as the uncertainty of Σ − (pe − e − ) mass window.
The main uncertainties in the ST selection, including the total number of J/ψ events, the reconstruction ofΛ and the bachelor π + , cancel in the double tag method.
The tag bias is related with the MC sample used to obtain the ST efficiency. We change the sample with the decay chain J/ψ →Σ(1385) + Σ − (Σ − → X) to the sample with the decay chain J/ψ →Σ(1385) + Σ − (Σ − → pe − e − ) and J/ψ →Σ(1385) + Σ − (Σ − → Σ + e − e − ). The average relative change of the ST efficiency is taken as the associated uncertainty, which is 1.3%. The statistical uncertainty of the tag yields is 0.38%, and it is taken into account together with the statistical uncertainties of efficiencies when calculating the ULs.
To estimate the uncertainty of the MC model for the signal, we change the values of parameters in the model that describes the q 2 -dependent differential decay width of B − 1 → B + 2 l − l − [24]. We take the relative changes of the DT efficiencies as the associated uncertainties for Σ − → pe − e − and Σ − → Σ + X, which are 1.0% and 0.9%, respectively.
Since the limit for | p miss | (0.1 GeV/c) is much larger than the kinematic limit, the uncertainty of the requirement on | p miss | is negligible. The relative uncertainties for branching fractions of π 0 → γγ and Σ + → pπ 0 are taken from the PDG [19], and are 0.034% and 0.6%, respectively. The uncertainty for the π 0 is so small that it can be ignored. The total systematic uncertainties are obtained by adding all uncertainties above in quadrature.

V. RESULTS
The ULs for the signal yields are calculated using a frequentist method with an unbounded profile likelihood treatment of systematic uncertainties, which is implemented by the class TROLKE in the ROOT framework [47]. The number of the signal and background events are assumed to follow a Poisson distribution, the detection efficiency is assumed to follow a Gaussian distribution, and the systematic uncertainty is considered as the standard deviation of the efficiency. The resulting UL for the branching fraction is determined by where s obs 90 is the upper limit on the number of signal events determined at the 90% CL. The ULs for branching fractions are B Σ − → pe − e − < 6.7 × 10 −5 , B Σ − → Σ + X < 1.2 × 10 −4 .

VI. SUMMARY
To summarize, with the data sample of (1310.6±7.0)× 10 6 J/ψ events collected by BESIII detector, a search for the LNV decay Σ − → pe − e − and the rare inclusive decay Σ − → Σ + X is performed for the first time. No signal event is observed, and the upper limits on the branching fractions of Σ − → pe − e − and Σ − → Σ + X at the 90% CL are 6.7 × 10 −5 and 1.2 × 10 − 4 , respectively. Our results are well above the prediction in references [22][23][24].