Search for an invisible $Z^\prime$ in a final state with two muons and missing energy at Belle II

The $L_{\mu}-L_{\tau}$ extension of the standard model predicts the existence of a lepton-flavor-universality-violating $Z^{\prime}$ boson that couples only to the heavier lepton families. We search for such a $Z^\prime$ through its invisible decay in the process $e^+ e^- \to \mu^+ \mu^- Z^{\prime}$. We use a sample of electron-positron collisions at a center-of-mass energy of 10.58GeV collected by the Belle II experiment in 2019-2020, corresponding to an integrated luminosity of 79.7fb$^{-1}$. We find no excess over the expected standard-model background. We set 90$\%$-confidence-level upper limits on the cross section for this process as well as on the coupling of the model, which ranges from $3 \times 10^{-3}$ at low $Z^{\prime}$ masses to 1 at $Z^{\prime}$ masses of 8$GeV/c^{2}$.

The Lµ −Lτ extension of the standard model predicts the existence of a lepton-flavor-universalityviolating Z boson that couples only to the heavier lepton families. We search for such a Z through its invisible decay in the process e + e − → µ + µ − Z . We use a sample of electron-positron collisions at a center-of-mass energy of 10.58 GeV collected by the Belle II experiment in 2019-2020, corresponding to an integrated luminosity of 79.7 fb −1 . We find no excess over the expected standard-model background. We set 90%-confidence-level upper limits on the cross section for this process as well as on the coupling of the model, which ranges from 3 × 10 −3 at low Z masses to 1 at Z masses of 8 GeV/c 2 .
Recent experimental observations are in tension with the standard model (SM) of particle physics. A notable example is the difference between the measured and expected values of the muon anomalous magnetic moment [1,2]. In addition, the SM is known to provide an incomplete description of nature since, among other prominent issues, it does not address the phenomenology related to the existence of dark-matter [3], specifically the prediction of the observed relic density. A simple way to explain both phenomena is the L µ − L τ extension of the SM [4][5][6]. This model gauges the difference of the muon and tau lepton numbers, giving rise to a massive, electrically neutral, vector boson, the Z . This particle would couple to the SM only through µ, τ , ν µ , and ν τ with coupling g . The Z , with such a lepton-flavor-universalityviolating coupling, would contribute to the muon magnetic moment and, for certain values of g and mass M Z , would explain the observed anomaly [7]. This model may resolve the tensions in flavor observables reported by the LHCb, Belle and BABAR collaborations [8][9][10][11][12][13][14][15][16][17][18]. It may also reproduce the observed dark-matter relic density, assuming dark-matter is charged under L µ − L τ . Two possible scenarios have been proposed, suggesting sterile neutrinos [5] or light Dirac fermions [6] as dark-matter candidates.
In this Letter we report a search, performed with the Belle II experiment, for the Z in the process e + e − → µ + µ − Z with Z → invisible, where the Z is radiated off a muon. We consider two different scenarios. If the Z couples only to SM particles, a model henceforth referred to as the "vanilla" L µ − L τ model, the invisible decay happens only through neutrinos, with a branching fraction B(Z → νν) that varies between ∼33% and ∼100% depending on theZ mass [19,20]. Alternatively, the Z can decay directly into a pair of dark-matter particles χχ with a coupling constant α D = g D 2 /4π, and there is no a priori reason for α D to be small. In this case, one can expect g D g , which implies B(Z → χχ) ≈ 1: we henceforth refer to this second scenario as the "fully invisible" L µ − L τ model. We provide results for each of the two scenarios for M Z < 9 GeV/c 2 . In the vanilla model, the intrinsic width Γ Z of the Z is negligible compared with the experimental resolution. In the fully invisible model, Γ Z depends on α D : it is negligible for values of α D smaller than 1 for M Z ≈ 1.5 GeV/c 2 , and smaller than 0.1 for M Z ≈ 4.5 GeV/c 2 . We focus our analysis on the case in which B(Z → χχ) ≈ 1 and Γ Z is negligible. We study separately one example in which Γ Z is not negligible and assume one benchmark value such that Γ Z = 0.1M Z , corresponding to α D = 2.9.
Searches for a Z decaying to muons have been performed by the BABAR [21], Belle [22], CMS [23] and ATLAS [24] experiments: they only constrain the vanilla L µ − L τ model in the parameter space we explore. Searches for an invisibly decaying Z have been performed by the NA64-e experiment [25] in the low-mass region and by Belle II using data collected during the commissioning run in 2018, with a luminosity of 0.276 fb −1 [26]: these searches set constraints both in the vanilla and fully invisible L µ − L τ models.
We use a sample of e + e − collisions collected by Belle II at the center-of-mass (c.m.) energy of the Υ (4S) resonance, 10.58 GeV, in 2019-2020, corresponding to a total integrated luminosity of 79.7 fb −1 [27]. This search supersedes that in Ref. [26], with an integrated luminosity nearly 300 times larger, improved muon identification, and the use of refined analysis algorithms.
The invisible-Z signature is a narrow enhancement in the distribution of the mass M recoil of the system recoiling against a muon pair. In the following, recoil quantities are computed by using the measured muon momenta and the knowledge of the initial-state total momentum. These quantities coincide with Z properties for signal events and typically correspond to undetected SM particles for background. We use the recoil mass squared M 2 recoil since this quantity has a smoother distribution than M recoil for low masses. We select events with exactly two charged particles identified as muons, and negligible additional activity in the detector. The dominant backgrounds are processes which produce two muons and missing energy. These are primarily e + e − → µ + µ − (γ) events with one or more undetected photons, e + e − → τ + τ − (γ) events with both τ leptons decaying to muons and neutrinos, and e + e − → e + e − µ + µ − events (dominated by two-photon fusion production) with electrons outside the detector acceptance.
We extract the signal yield from a fit to the twodimensional distribution of M 2 recoil and the polar angle θ recoil of the recoil momentum with respect to the detector axis. Control samples are used to check simulation predictions and to infer correction factors. Selections are optimized using simulated events prior to examining data. However, one of the corrections based on control samples was derived after observing a discrepancy in the data with respect to the simulation.
The Belle II detector [28] operates at the interaction region of the SuperKEKB electron-positron collider [29], located at KEK in Tsukuba, Japan. The energies of the electron and positron beams are 7 GeV and 4 GeV with a boost of the c.m. frame βγ = 0.28 relative to the laboratory frame. The detector consists of several subdetectors arranged around the beam pipe in a cylindrical structure. Subdetectors relevant for this analysis are briefly described here in order from innermost out; more details are given in Refs. [28,30]. The innermost component is the vertex detector, consisting of two inner layers of silicon pixels and four outer layers of silicon strips. The second pixel layer is partially installed, covering one sixth of the azimuthal angle. The main tracking subdetector is a large helium-based small-cell drift chamber. An electromagnetic calorimeter (ECL) consists of a barrel and two endcaps made of CsI(Tl) crystals. A superconducting solenoid provides a 1.5 T magnetic field. A K 0 L and muon subdetector is made of iron plates providing the magnetic flux-return yoke, alternated with resistive-plate chambers and plastic scintillators in the barrel, and with plastic scintillators only in the endcaps. The longitudinal and transverse directions, and polar angle θ are defined with respect to the detector's cylindrical axis in the direction of the electron beam. In the following, quantities are defined in the laboratory frame unless otherwise specified.
Particle identification is implemented through the definition of likelihoods for each charged particle hypothesis by combining information from all the subdetectors. Identification of muons relies mostly on charged-particle penetration depth in the muon detector for momenta larger than 0.7 GeV/c and on information from the drift chamber and ECL otherwise. The ratio between the muon likelihood and the sum of the likelihoods of all particle hypotheses is required to be greater than 0.5. This retains 93-99% of muons, and rejects 80-97% of pions, depending on their momenta. Electrons, used in controlsample studies, are identified primarily by comparing momenta with energies of associated ECL depositions, with a similar likelihood-ratio method. Photons are reconstructed from ECL depositions with energy greater than 100 MeV that are not associated with any track.
The search uses an online event selection (trigger) that requires events with at least one pair of tracks in a restricted polar-angle acceptance, θ ∈ [37, 120] • , and an azimuthal opening angle larger than 90 • or 30 • for data collected in 2019 and 2020, respectively (two-track trigger). A dedicated trigger veto rejects events consistent with Bhabha scattering.
In the offline analysis, we require that tracks originate from the interaction point, with transverse and longitudinal projections of their distance of closest approach smaller than 0.5 and 2.0 cm, respectively, to reject spurious and beam-induced background tracks. We require events to have exactly two oppositely charged particles identified as muons, that pass the trigger requirements and have transverse momenta larger than 0.4 GeV/c. We reject events with opening angles between the muons in the c.m. frame larger than 179 • , to suppress µ + µ − (γ) backgrounds, which typically produce back-to-back muons. For M 2 recoil < 4 GeV 2 /c 4 , most of the µ + µ − (γ) background comes from events with single-photon emission: we require θ recoil to be within the ECL barrel acceptance [34,123] • so as to exclude regions where photons can escape undetected. Additionally, we reject events with θ recoil ∈ [89, 91] • , where there is a 1.5 mm-wide gap in the ECL instrumentation. For M 2 recoil > 4 GeV 2 /c 4 , µ + µ − (γ) background arises predominantly from multiple photon emission. In this case the recoil direction does not coincide with the direction of the lost photons, so we instead require θ recoil < 123 • because the signal is dominantly produced in the forward direction due to the c.m. boost. To suppress µ + µ − (γ) backgrounds, we impose a photon veto: we require the total energy of all photons to be less than 0.5 GeV and no photon to be within 15 • of the recoil momentum. To suppress µ + µ − (γ) and e + e − µ + µ − backgrounds, we require that the transverse recoil momentum in the c.m. frame exceed 0.5 GeV/c.
After these selections, the remaining background comes dominantly from τ + τ − (γ) events with τ leptons decaying to muons or to pions misidentified as muons in the region M 2 recoil < 50 GeV 2 /c 4 , and from e + e − µ + µ − events elsewhere. The background from e + e − → µ + µ − (γ) is subleading across the entire mass range.
The final selection uses an artificial neural network, denoted as Punzi-net [40], trained on simulated signal and background events, and specifically designed to optimize a figure of merit [41] for all Z mass hypotheses simultaneously. We use as inputs the four kinematic variables, all defined in the c.m. frame, with the highest discriminating power: the transverse momentum of a muon with respect to the dimuon thrust axis [42,43]; the transverse momentum of the higher-energy muon with respect to the momentum direction of the lower-energy muon; the longitudinal momentum of the higher-energy muon with respect to the momentum direction of the lower-energy muon; and the transverse momentum of the dimuon system. The first three variables exploit mostly the kinematic properties of Z production through radiation from a final state muon, compared with τ + τ − (γ) events, in which the recoil momentum arises from neutrinos from τ decays. The fourth variable exploits the kinematic features of µ + µ − (γ) and e + e − µ + µ − backgrounds, which typically have low transverse momenta. The Punzi-net produces an output between 0 (background) and 1 (signal): we select events with an output larger than 0.5. Additional details are given in Ref. [40].
The resulting signal efficiency is typically 5%, nearly uniform as a function of M 2 recoil . The Punzi-net selection removes nearly all τ + τ − (γ) background for M 2 recoil < 50 GeV 2 /c 4 , with a sensitivity gain between 5 and 15, depending on the mass. The residual background comes dominantly from µ + µ − (γ) in the region M 2 recoil < 50 GeV 2 /c 4 and from a large irreducible contribution of e + e − µ + µ − for M 2 recoil > 50 GeV 2 /c 4 , where the Punzi-net selection has limited discriminating power.
The region M 2 recoil < 1 GeV 2 /c 4 is dominated by the µ + µ − (γ) process with a single photon emission. Above 1 GeV 2 /c 4 the µ + µ − (γ) process contributes mostly with events containing two radiated photons. Typically, one photon is collinear with the beams and outside the acceptance, while the other is emitted in the direction of one of the gaps between the barrel and the forward or backward ECL endcaps. For M 2 recoil in the 1-50 GeV 2 /c 4 range, this produces two distinctive bands in the θ c.m. recoil -M 2 recoil plane, where θ c.m. recoil is the polar angle of the recoil momentum in the c.m. frame. This feature is exploited in a two-dimensional fitting procedure, which incorporates the expected background shapes due to µ + µ − (γ) events, doubling the sensitivity relative to a one-dimensional fit.
We fit the data by maximizing a binned likelihood based on signal and background two-dimensional templates obtained from simulation. The parameter of interest is the signal cross section, with the background normalization determined by the fit. The M 2 recoil bin widths vary across the spectrum and are set to the signal M 2 recoil resolution. The binning in θ c.m. recoil is determined by the distribution of µ + µ − (γ) events and depends on M 2 recoil . The number of bins varies from one (for high M 2 recoil ) to five (for low M 2 recoil ). We perform a squared-mass scan in steps corresponding to one unit of signal M 2 recoil resolution, testing all simulated mass hypotheses. Each fit is performed in search windows of 20 M 2 recoil bins centered around each hypothesis. Uncorrelated systematic uncertainties on the signal and background shapes (see below) are included in the respective templates by introducing in each bin of the template Gaussian nuisance parameters constrained by the corresponding uncertainties. A frequentist procedure based on the profile-likelihood ratiot µ [44] is used to obtain 90% confidence-level (C.L.) intervals on the cross section. We use the pyhf software package [45] for inference and check the consistency of our results with simplified simulated samples.
We also consider the scenario in which Γ Z is not negligible, as expected for large α D values [15,46], and study one benchmark case that assumes Γ Z = 0.1M Z . We account for the nonzero width in the fitting procedure by changing the shape of the signal templates to Breit-Wigner distributions with the widths Γ Z convolved with Gaussian resolution functions. We use only one-dimensional M 2 recoil templates and enlarge the search windows to cover the sizable signal width.
Three control samples are used to validate the analysis and estimate systematic uncertainties. The µµγ control sample is obtained by reversing the photon-veto criteria thus requiring a photon of energy greater than 1 GeV within 15 • of the recoil momentum direction. The eµ and ee control samples are obtained by requiring one or both tracks to be identified as electrons and then applying all the other selection criteria.
The efficiency of the two-track trigger is studied with the eµ control sample with events collected by an ECLbased trigger, which is used as a reference. This requires that the total energy deposition in the barrel and forward endcap exceed 1 GeV. The two-track trigger efficiency increases from 89 to 93% as a function of M 2 recoil . We use the ee control sample to study the photon-veto performance. We select events with M 2 recoil < 1 GeV 2 /c 4 , since they come dominantly from Bhabha scattering with a single radiated photon. The results indicate that the photon-veto inefficiency in the backward barrel ECL is larger than that estimated in simulation. This study was performed after observing a large data-simulation disagreement in the signal region compatible with photonveto inefficiency. The photon-veto inefficiencies measured with the ee control sample are used to correct the expected µµ background.
We estimate systematic uncertainties on the signal efficiency and on the signal and background template shapes. The uncertainties on the template shapes independently affect each of the bins contained within the templates.
Uncertainties in selection efficiencies due to datasimulation mismodeling are studied by comparing data and simulation in the µµγ and eµ control samples in three M 2 recoil ranges: [−0. 5,9], [9,36], [36,81] GeV 2 /c 4 . The two control samples provide complementary coverage of the M 2 recoil range, with µµγ addressing the lower region and eµ covering the higher. Systematic uncertainties due to data-simulation mismodeling in the trigger, luminosity, tracking efficiency, muon identification, background cross sections, and effect of the selections are collectively evaluated through data-simulation comparison before the application of the Punzi-net. Systematic uncertainties due to the Punzi-net selection-efficiency differences in data and simulation are evaluated by studying its efficiencies, as they are indicators of the performances for the signal-like background component. The differences from unity of the data-to-simulation ratios of event yields before the Punzi-net application and of the Punzinet efficiencies in the three M 2 recoil ranges are summed in quadrature and found to be 2.7%, 6.5%, and 8.3%, respectively. These differences are assigned as systematic uncertainties on the signal efficiency.
The recoil mass resolution is studied using the µµγ sample. The width of the M 2 recoil distribution is 8% larger in data than in simulation. This translates to a systematic uncertainty of 10% on the signal template shape.
Systematic uncertainties due to background shapes are evaluated using the µµγ and eµ samples. We compute the standard deviation of the bin-by-bin data-tosimulation ratios of the number of events for each search window. To be conservative, we assign twice the largest of these standard deviations in each of the three M 2 recoil ranges as an uncertainty for the shape in the respective M 2 recoil ranges. We use the µµγ control sample for M 2 recoil up to 56 GeV 2 /c 4 and the eµ control sample above. The resulting uncertainties are 3.2%, 8.6%, and 25% in the three M 2 recoil ranges. Uncertainties on the background template shape from the photon-veto inefficiency are studied using the ee control sample and are on average 34% for M 2 recoil < 1 GeV 2 /c 4 , decreasing to 5% above 1 GeV 2 /c 4 . We assign a systematic uncertainty of 1% to the measured integrated luminosity [27].
The observed and expected M 2 recoil distributions are shown in Fig. 1. We find no significant excess of data above the expected background. The χ 2 value describing the goodness of the two-dimensional fit is acceptable for each test Z mass with the largest incompatibility corresponding to a p value of 0.05. The largest local significance is 2.8σ for M Z = 2.352 GeV/c 2 . The global significance of this excess after correcting for the lookelsewhere effect [47] is 0.7σ.  Fig. 2 as functions of M Z , along with the 1σ and 2σ bands of expected limits (the median limits from background-only simulated samples). We set upper limits as small as 0.2 fb. In addition, we show upper limits for the benchmark scenario in which we assume non-negligible Γ Z . Our upper limits are dominated by statistical uncertainties for M Z < 6 GeV/c 2 , where systematic uncertainties degrade them by less than 5%. Above 6 GeV/c 2 , upper limits are dominated by systematic uncertainties (mainly due to background shapes), degrading them by about 40%.
Cross section results are translated into 90% CL upper limits on the coupling g . In both fully invisible and vanilla models, we focus on the direct-search results and do not show constraints obtained from reanalyses of data from neutrino experiments [7,48,49]. Figure 3 presents limits in the fully invisible L µ − L τ model for the cases of negligible and non-negligible Γ Z . For the case of negligible Γ Z , these constraints hold for M Z < ∼ 6.5 GeV/c 2 . Above this mass, there is no value of α D that produces both a negligible width and B(Z → χχ) ≈ 1, given the values of g being probed. Numerical values in Fig. 3 can still be used, but need to be rescaled by 1/ B(Z → χχ), which depends on α D . We also show limits from NA64-e [25] and the previous Belle II search [26]. Our results are world-leading for direct searches of Z with masses above 11.5 MeV/c 2 . They are the first direct-search results to exclude at 90% C.L. the fully invisible-Z model as an explanation of the (g − 2) µ anomaly for 0.8 < M Z < 5.0 GeV/c 2 . Figure 4 presents limits in the vanilla L µ − L τ model. Our results are world leading for direct searches of Z in the mass range 11.5 to 211 MeV/c 2 . More stringent limits are from NA64-e [26] below 11 MeV/c 2 and from Belle [22], BABAR [21], and CMS [23] searches for Z → µ + µ − above 211 MeV/c 2 .
Additional plots, including indirect constraints from neutrino experiments and detailed numerical results, are provided in the Supplemental Material [50]. In summary, we search for an invisibly decaying Z boson in the process e + e − → µ + µ − Z using data corresponding to 79.7 fb −1 collected by Belle II at SuperKEKB in 2019-2020. We find no significant excess above the expected background and set 90% C.L. upper limits on the coupling g ranging from 3 × 10 −3 at low Z masses to 1 for a mass of 8 GeV/c 2 . These are world-leading directsearch results for Z masses above 11.5 MeV/c 2 in the fully invisible L µ − L τ model and for masses in the range 11.5 to 211 MeV/c 2 in the vanilla L µ − L τ model. These limits are the first direct-search results excluding a fully invisible-Z -boson model as an explanation of the (g −2) µ anomaly for 0.8 < M Z < 5.0 GeV/c 2 .
We thank Andreas Crivellin for useful discussions during the preparation of this manuscript. This work, based on data collected using the Belle II detector, which was built and commissioned prior to This material is submitted as supplementary information for the Electronic Physics Auxiliary Publication Service We provide a text file with numerical results for the observed 90% CL upper limit on the cross section of e + e − → µ + µ − Z , with Z → invisible as well as of the observed 90% CL upper limit on g as functions of M Z .  Existing limits from BaBar [21], Belle [22], CMS [23] (95% CL), NA64-e [25], and Belle II [26] are shown, along with constraints (95% CL) derived from the trident production in neutrino experiments [7,48,49]. The red band shows the region that could explain the muon anomalous magnetic moment (g − 2)µ ± 2σ [2].  Existing limits from NA64-e [25] and Belle II [26] are shown, along with constraints (95% CL) derived from the trident production in neutrino experiments [7,48,49]. The vertical dashed line indicates the limit beyond which the hypothesis B(Z → χχ) ≈ 1 is not valid in the negligible Γ Z case. The red band shows the region that could explain the muon anomalous magnetic moment (g − 2)µ ± 2σ [2]. Existing limits from NA64-e [25] and Belle II [26] are shown, along with constraints (95% CL) derived from the trident production in neutrino experiments [7,48,49]. The vertical dashed line indicates the limit beyond which the hypothesis B(Z → χχ) ≈ 1 is not valid in the negligible Γ Z case. The red band shows the region that could explain the muon anomalous magnetic moment (g − 2)µ ± 2σ [2]. Existing limits from BaBar [21], Belle [22], CMS [23] (95% CL), NA64-e [25], and Belle II [26] are shown, along with constraints (95% CL) derived from the trident production in neutrino experiments [7,48,49]. The red band shows the region that could explain the muon anomalous magnetic moment (g − 2)µ ± 2σ [2].