Search for a $\mu^+\mu^-$ resonance in four-muon final states at Belle II

We report on a search for a resonance $X$ decaying to a pair of muons in $e^{+}e^{-}\rightarrow \mu^+ \mu^- X$ events in the 0.212-9.000 GeV/$c^{2}$ mass range, using 178 fb$^{-1}$ of data collected by the BelleII experiment at the SuperKEKB collider at a center of mass energy of 10.58 GeV. The analysis probes two different models of $X$ beyond the standard model: a $Z^{\prime}$ vector boson in the $L_{\mu}-L_{\tau}$ model and a muonphilic scalar. We observe no evidence for a signal and set exclusion limits at the 90$\%$ confidence level on the products of cross section and branching fraction for these processes, ranging from 0.046 fb to 0.97 fb for the $L_{\mu}-L_{\tau}$ model and from 0.055 fb to 1.3 fb for the muonphilic scalar model. For masses below 6 GeV/$c^{2}$, the corresponding constraints on the couplings of these processes to the standard model range from 0.0008 to 0.039 for the $L_{\mu}-L_{\tau}$ model and from 0.0018 to 0.040 for the muonphilic scalar model. These are the first constraints on the muonphilic scalar from a dedicated search.


I. INTRODUCTION
The standard model (SM) of particle physics is a highly predictive theoretical framework describing fundamental particles and their interactions.Despite its success, the SM is known to provide an incomplete description of nature.For example, it does not address the phenomenology related to dark matter, such as the observed relic density [1].In addition, some experimental observations show inconsistencies with the SM.Prominent examples include the longstanding difference between the measured and the expected value of the muon anomalous magneticmoment (g − 2) µ [2][3][4], possibly reduced by expectations based on lattice calculations [5], and the tensions in flavor observables reported by the BABAR, Belle, and LHCb experiments [6][7][8].Some of these observations can be explained with the introduction of additional interactions, possibly lepton-universality-violating, mediated by non-SM neutral bosons [9][10][11].Examples include the L µ − L τ extension of the SM and a muonphilic scalar model.
The L µ − L τ extension of the SM [12][13][14] gauges the difference between the muon and the τ -lepton numbers, giving rise to a new massive, neutral vector boson, the Z ′ .Among the SM particles, this particle couples only to µ, τ , ν µ , and ν τ , with a coupling constant g ′ .The Z ′ could also mediate interactions between SM and dark matter.
The muonphilic scalar S is primarily proposed as a solution for the (g − 2) µ anomaly [15][16][17][18].This particle couples exclusively to muons through a Yukawa-like interaction, which is not gauge-invariant under the SM gauge symmetry and may arise from a high-dimension operator term at a mass scale beyond the SM.In contrast to the L µ − L τ model, the muonphilic scalar model needs a high-energy completion.
Searches for a Z ′ decaying to muons have been reported by the BABAR [19], Belle [20], and CMS [21] Collaborations.An invisibly decaying Z ′ has been searched for by the Belle II [22,23] and NA64-e [24] experiments.The Belle II experiment also searched recently for a Z ′ decaying to τ + τ − [25].Constraints on the existence of a muonphilic scalar have been obtained by reinterpretations of Z ′ searches into muons [18].However, important experimental details may be unaccounted for in these reinterpretation studies, including the significantly different kinematic properties of the signal and the corresponding variation of the efficiency.
Here we report a search for the process e + e − → µ + µ − X, with X → µ + µ − , where X indicates Z ′ or S. The signal signature is a narrow enhancement in the mass distribution of oppositely charged muons M (µµ) in e + e − → µ + µ − µ + µ − events.We use data collected by the Belle II experiment at a center-of-mass (c.m.) energy √ s corresponding to the mass of the Υ(4S) resonance.The L µ − L τ model is used as a benchmark to develop the analysis; we then apply the same selections to the muonphilic scalar model and evaluate the performance.In both models, the X particle is at leading order emitted as final-state radiation (FSR) from one of the muons, as shown in Fig. 1.For the range of couplings explored in this study, the lifetime of X is negligible compared to the experimental resolution.The analysis techniques are optimized using simulated events prior to examining data.
We select events with exactly four charged particles with zero net charge, where at least three are identified as muons, with an invariant mass M (4µ) close to √ s/c 2 , and with negligible detected energy in addition to that associated to the charged particles.The dominant, non-peaking background is the SM e + e − → µ + µ − µ + µ − process, whose main production diagrams are shown in Fig. 2. The analysis uses kinematic variables combined with a multivariate technique to enhance the signal-tobackground ratio.A kinematic fit improves the dimuon mass resolution.The signal yield is extracted through a series of fits to the M (µµ) distribution, which allows an estimate of the background directly from data.
The paper is organized as follows.In Sec.II we briefly describe the Belle II experiment.In Sec.III we report the datasets and the simulation used.In Sec.IV we  present the event selections.In Sec.V we describe the signal modeling and the fit technique to extract the signal.In Sec.VI we discuss the systematic uncertainties.In Sec.VII we describe and discuss the results.Section VIII summarizes our conclusions.

II. THE BELLE II EXPERIMENT
The Belle II detector [26,27] consists of several subdetectors arranged in a cylindrical structure around the e + e − interaction point.The longitudinal direction, the transverse plane, and the polar angle θ are defined with respect to the detector's cylindrical axis in the direction of the electron beam.
Subdetectors relevant for this analysis are briefly described here in order from innermost out; a full description of the detector is given in Refs.[26,27].The innermost subdetector is the vertex detector, which consists of two inner layers of silicon pixels and four outer layers of silicon strips.The second pixel layer was only partially installed for the data sample we analyze, covering one sixth of the azimuthal angle.The main tracking subdetector is a large helium-based small-cell drift chamber.The relative charged-particle transverse momentum res-olution, ∆p T p T , is typically 0.1%p T ⊕ 0.3%, with p T expressed in GeV/c.Outside of the drift chamber, time-ofpropagation and aerogel ring-imaging Cherenkov detectors provide charged-particle identification in the barrel and forward end cap region, respectively.An electromagnetic calorimeter consists of a barrel and two end caps made of CsI(Tl) crystals: it reconstructs photons and identifies electrons.A superconducting solenoid, situated outside of the calorimeter, provides a 1.5 T magnetic field.A K 0 L and muon subdetector (KLM) is made of iron plates, which serve as a magnetic flux-return yoke, alternated with resistive-plate chambers and plastic scintillators in the barrel and with plastic scintillators only in the end caps.In the following, quantities are defined in the laboratory frame unless specified otherwise.

III. DATA AND SIMULATION
We use a sample of e + e − collisions produced at c.m. energy √ s = 10.58GeV in 2020-2021 by the Su-perKEKB asymmetric-energy collider [28] at KEK.The data, recorded by the Belle II detector, correspond to an integrated luminosity of 178 fb −1 [29].
Simulated signal e + e − → µ + µ − Z ′ with Z ′ → µ + µ − and e + e − → µ + µ − S with S → µ + µ − events are generated using MadGraph5_aMC@NLO [30] with initial-state radiation (ISR) included [31].Two independent sets of Z ′ events are produced, with Z ′ masses, m Z ′ , ranging from 0.212 GeV/c 2 to 10 GeV/c 2 in steps of 250 MeV/c 2 , to estimate efficiencies, define selection requirements, and develop the fit strategy, and in steps of 5 MeV/c 2 , exclusively dedicated to the training of the multivariate analysis.Samples of S events are generated in 40 MeV/c 2 steps for m S masses between 0.212 GeV/c 2 and 1 GeV/c 2 and in 250 MeV/c 2 steps from 1 GeV/c 2 to 10 GeV/c 2 .
The detector geometry and interactions of final-state particles with detector material are simulated using Geant4 [42] and the Belle II software [43,44].

IV. SELECTIONS
The selection requirements are divided into four categories: trigger, particle identification, candidate selections, and final background suppression.

A. Trigger selections
We filter events selected by the logical OR of a threetrack trigger and a single-muon trigger.The efficiency of both triggers is measured using a reference calorimeteronly trigger, which requires a total energy deposit above 1 GeV in the polar angle region 22 • < θ < 128 • .We require a single electron of sufficient energy to activate the calorimeter trigger.The three-track trigger requires the presence of at least three tracks with 37 • < θ < 120 • .The efficiency of this trigger is measured in four-track events containing at least two pions and one electron and depends on the transverse momenta p T of the two charged particles with lowest transverse momenta, reaching a plateau close to 100% for p T above 0.5 GeV/c.The single-muon trigger is based on the association of hits in the barrel KLM with geometrically matched tracks extrapolated from the inner tracker.The efficiency of this trigger is measured in a sample of two-track events with one electron and one muon, mostly from the e + e − → τ + τ − process, reaching a plateau of about 90% in the polar angle range 51 • < θ < 117 • .The efficiency for events with multiple muons is computed using the singlemuon efficiency assuming no correlation.The overall trigger efficiency is 91% for m Z ′ close to the dimuon mass, increases smoothly to a plateau close to 99% in the mass range 2.5-8.5 GeV/c 2 , and then drops to 89% at 10 GeV/c 2 .It is slightly higher, 95%, for low masses in the S case, due to the harder spectrum of the muonphilic scalar (see Sec. IV E).

B. Particle identification
The identification of muons relies mostly on chargedparticle penetration in the KLM for momenta larger than 0.7 GeV/c and on information from the drift chamber and the calorimeter otherwise.The selection retains 93%-99% of the muons, and rejects 80%-97% of the pions, depending on their momenta.Electrons are identified mostly by comparing measured momenta to the energies of the associated calorimeter deposits.Photons are reconstructed from calorimeter energy deposits greater than 100 MeV that are not associated with any track.Details of particle reconstruction and identification algorithms are given in Refs.[27,45].

C. Candidate selections
We require that events have exactly four charged particles with zero net charge and invariant mass M (4µ) between 10 GeV/c 2 and 11 GeV/c 2 .To suppress backgrounds from misreconstructed and single-beam induced tracks, the transverse and longitudinal projections of the distance of closest approach to the interaction point of the tracks must be smaller than 0.5 cm and 2.0 cm, respectively.At least three of the tracks must be identified as muons.This requirement provides better performance than requiring four identified muons or a pair of samesign muons.It rejects almost all backgrounds other than e + e − → µ + µ − µ + µ − , while retaining good efficiency for signal.
In the low dimuon-mass region below 1 GeV/c 2 , there are residual backgrounds from e + e − → µ + µ − γ, in which the photon converts to an electron-positron pair, and e + e − → e + e − µ + µ − events.Some of these electrons that are misidentified as muons have low momenta, and thus do not reach the KLM.The remaining electrons leave signals in the KLM at the gap between the barrel and end cap or in the gaps between adjacent modules.In this mass region, we therefore require that no track be identified as an electron.
To suppress radiative backgrounds and, in general, backgrounds with neutral particles, we require that the total energy of all photons be less than 0.4 GeV.
We add a further requirement when M (4µ) < 10.4 GeV/c 2 , exploiting the correlation between the invariant mass and the initial state radiation.This additional selection requires the total energy of all photons to be less than the expected energy of a single irradiated photon, which depends linearly on M (4µ).
In addition, we reject events in which the angle in the c.m. frame between the momentum of the four-muon system and that of the system composed of all the photons is larger than 160 • .
At this level of the analysis, there is no a priori attempt to select a single µ + µ − pair as a candidate X decay.Each event includes four possible µ + µ − candidates, each with a different dimuon mass M (µµ), causing some combinatorial background.For each µ + µ − candidate, the pair of the two remaining muons is labeled as the "recoil" pair.We consider independently all the µ + µ − candidates, each with its recoil muons.
The resulting candidate M (µµ) distribution is shown in Fig. 3.The average data-to-simulation yield ratio is 0.76, due to the lack of ISR in the AAFH four-muon generator, in agreement with the values previously reported by BABAR [19] and Belle [20].The excess of the simulation over data in the mass region below 2 GeV/c 2 is also due to an overestimate of the three-track-trigger efficiency for very low transverse-momentum tracks.Specifically, the enhancement in the range 1-2 GeV/c 2 originates from the process e + e − → µ + µ − γ with a near-beam-energy photon, followed by conversion of the photon into electronpositron pairs in detector material.These events are almost entirely removed by the final background suppression.Other visible features include the unsimulated contributions from the ρ, J/ψ, and Υ(1S) resonances.

D. Final background suppression
The final selection relies on a few distinctive features that allow the discrimination of signal from background: signal events include a µ + µ − resonance, which can be seen both in the candidate muon pair and in the mass of the system recoiling against the two recoil muons; the signal is emitted through FSR from a muon (Fig. 1), while the dominant four-muon background proceeds through double-photon-conversion process (Fig. 2, left); and the double-photon-conversion process has a distinctive momentum distribution.In the following, some of the relevant variables sensitive to these three classes of features are discussed: they are based both on the µ + µ − candidate, where we search for signal, and on the recoil muons.For illustration, we show the case for a Z ′ signal with m Z ′ = 3 GeV/c 2 and for background, both with reconstructed candidate dimuon masses 2.75 < M (µµ) < 3.25 GeV/c 2 .The background in this mass region is dominated by the e + e − → µ + µ − µ + µ − process, see Fig. 3.
Magnitudes of the two candidate muon momenta, p µ + and p µ − , and their correlations are sensitive to the presence of a resonance (Fig. 4).Signal events cluster preferentially in the central part of the distribution, while background predominantly populates the extremes.A similar effect occurs for the momenta of the two recoil muons, p rec µ + and p rec µ − (Fig. 5), which provide instrumentally uncorrelated access to the same information, though with a different resolution.The cosine of the helicity angle of the candidate-muon pair cos ϕ hel , defined as the angle between the momentum direction of the c.m. frame and the µ − in the candidate-muon-pair frame, has a uniform distribution for a scalar or an unpolarized massive vector decaying to two fermions, but not for the background processes (Fig. 6).The slight departure from uniformity in the signal case is due to momentum resolution, which smears the determination of the boost to the muon-pair frame.
The double-photon-conversion process (Fig. 2, left) accounts for 80% of the four-muon background cross section.It also includes the case of off-shell photon emission (and subsequent dimuon production) from one of the initial-state electrons, ISR double-photon conversion, which contributes mainly in the low mass region.The annihilation process (Fig. 2, right) is very similar to the signal process and constitutes an nearly irreducible background: it accounts for 20% of the cross section for M (µµ)<1 GeV/c 2 and for 10% above.Transverse projections of the candidate-muon-pair momentum p µµ on the direction of the recoil muon with minimum momentum, p T (p µµ , p rec min ), and on the direction of the recoil muon with maximum momentum, p T (p µµ , p rec max ), are sensitive to FSR emission (Fig. 7).This is because, in case of signal, these are the transverse momenta of X with respect to the direction of the muon from which it was emitted, and with respect to the direction of the other muon.We assign to the transverse projection p T (p µµ , p rec min ) the sign of the longitudinal projection, since this slightly increases the discriminating power.The transverse momentum of the candidate muon pair with respect to the z axis, p T (p µµ , z), which approximates the beam direction, is sensitive to the ISR double-photon conversion mechanism of emission because p T (p µµ , z) is the transverse momentum of the muon pair with respect to the initialstate-electron direction.This variable is shown in Fig. 8 in a two-dimensional distribution versus p T (p µµ , p rec min ) to illustrate the correlation between variables sensitive to ISR and FSR, respectively.
The double photon conversion process produces two muon pairs from two off-shell photons.The dominant background at a mass m 0 is produced when one pair has M (µµ) near m 0 and the other pair has a mass at the lowest possible value above 2m µ .In these cases, the c.m. momentum p 0 of the two pairs can be analytically calculated.In e + e − → µ + µ − µ + µ − background events the dimuon c.m. momentum p µµ peaks at p 0 , in contrast to the signal, at least for two of the dimuon candidates.This difference is visible in Fig. 9.
We select sixteen discriminating variables: the magnitude of the candidate-muon-pair momentum p µµ ; the absolute value of the cosine of the helicity angle in the candidate-muon-pair rest frame; the magnitudes of the candidate-single-muon momenta; the candidate-singlemuon transverse momenta; the magnitudes of the recoil-   tion of p T (p µµ , p rec min ) with p T (p µµ , z); and the transverse projections of the recoil-muon-pair momentum on the directions of the momenta of the candidate muons with minimum and maximum momentum.All variables other than the helicity angle are defined in the c.m. frame.
We use multilayer perceptron (MLP) artificial neural networks [46] with 16 input neurons, fed with the discriminant variables, and with one output neuron.The MLPs are developed using simulated Z ′ and simulated background events.To improve performance, we use five separate MLPs in different M (µµ) intervals, which we refer to as MLP ranges: 0.21-1.00GeV/c 2 , 1.00-3.75GeV/c 2 , 3.75-6.25GeV/c 2 , 6.25-8.25 GeV/c 2 , and 8.25-10.00GeV/c 2 .Within the MLP ranges, better performances are obtained if the dependence of the input variables on m Z ′ is reduced.This is achieved by scaling the momentum-dimensioned variables by p 0 , which is the maximum c.m. momentum of the two muon pairs.To ensure that MLPs are not biased to specific mass values, we use a training signal sample that has mass steps of 5 MeV/c 2 , so as to approximate a continuous distribution.For nearly all masses, the most discriminating variable is p µµ , followed by the correlation of p µ + and p µ − .The selection applied on the MLP output is studied separately in each MLP range, by maximizing the figure of merit described in Ref. [47], and then expressed as a function of M (µµ) by interpolation.The background rejection factor achieved by the MLP selection varies from 2.5 to 14, with the best value around 5 GeV/c 2 .The resulting background is composed almost entirely of e + e − → µ + µ − µ + µ − events, with e + e − → µ + µ − (γ) and e + e − → e + e − µ + µ − processes contributing only below 1 GeV/c 2 .The MLP selection is applied separately  to each of the four candidates per event, reducing the average candidate multiplicity per background event to 1.7.The average candidate multiplicity per signal event varies between 1.4 and 3, depending on the mass.

E. Efficiencies and dimuon spectrum
The efficiencies of the full selection for the L µ −L τ and muonphilic scalar models are shown in Fig. 10.The S boson, due to angular momentum conservation, is produced through a p-wave process, and has a higher momentum spectrum than the Z ′ , which is produced via an s-wave process.For masses below 1 GeV/c 2 , this implies the presence of higher momentum muons in the case of the scalar, which are better identified and detected with higher efficiency.For masses above 1 GeV/c 2 , the muon identification efficiencies for S and Z ′ are simi- lar, and the higher signal efficiencies for the Z ′ are due to the differences in the distributions of the momentumdimensioned input variables, with the MLP optimized for the L µ − L τ model.The signal efficiencies shown here are corrected for ISR.Although the signal generator includes ISR, it does not include the large-angle hardradiation component that can produce photons in the acceptance, and thereby veto events.This effect is studied using e + e − → µ + µ − γ events, generated with KKMC that simulates ISR in a complete way.We require the dimuon mass in the range 10-11 GeV/c 2 , to emulate the selection we apply on M (4µ), that intrinsically limits the maximum energy at which a photon can be radiated.Applying the selection on the photons (see Sec. IV) gives a relative reduction of 2.8% in efficiency.To improve the m X resolution, a kinematic fit is applied requiring that the sum of the four-momenta of the muons be equal to the four-momentum of the c.m. system, thereby constraining the four-muon invariant mass to √ s/c 2 .The resulting M (µµ) distribution is shown in Fig. 11.With the exception of the very low mass region, the data-to-simulation yield ratio is generally above one.This is because the MLPs perform worse on data, which naturally includes ISR, than on background simulation, which does not.This is not the case for the signal, which is simulated with the ISR contribution.Also visible in Fig. 11 are modulations originating from the five MLP ranges.Neither of these effects produce narrow peaking structures at the scale of the signal resolution, 2-5 MeV/c 2 (Sec.V).As in Fig. 3, contributions from the unsimulated ρ, J/ψ, and Υ(1S) resonances are visible.

V. SIGNAL MODELING AND FIT
To search for the signal, we use the reduced dimuon mass M R ≡ M 2 (µµ) − 4m 2 µ , which has smoother behavior than the dimuon mass near the kinematic threshold.The reduced-mass resolution is 2-2.5 MeV/c 2 for m Z ′ below 1 GeV/c 2 , increases smoothly to 5 MeV/c 2 for m Z ′ around 5 GeV/c 2 , then decreases to 2.5 MeV/c 2 at 9 GeV/c 2 .
The signal yields are obtained from a scan over the M R spectrum through a series of unbinned maximum likelihood fits.The signal M R distributions are parameterized from the simulation as sums of two Crystal Ball functions [48] sharing the same mean.The background is described with a quadratic function with coefficients as free parameters in the fit for masses below 1 GeV/c 2 , and with a straight line above.Higher-order polynomials are investigated, but their corresponding fitted coefficients are compatible with zero over the full mass spectrum.The broad ρ contribution is accommodated by the quadratic fit.
The scan step-size is set equal to the mass resolution, which is sufficient to detect the presence of a X resonance regardless of its mass.The fit interval is 60 times the mass resolution, following an optimization study.A total of 2315 fits are performed, covering dimuon masses from 0.212 GeV/c 2 to 9 GeV/c 2 .If a fitting interval extends over two different MLP ranges, we use the MLP corresponding to the central mass.We exclude the dimuon mass interval 3.07-3.12GeV/c 2 , which corresponds to the J/ψ mass.The Υ(1S) peak is beyond the mass range of the search.The fit yields are scaled by 7% to account for a bias estimated in a study of the J/ψ in an e + e − µ + µ − control sample, which obtains a width 25% larger than in simulated signals of that mass.Propagating this 25% degradation in resolution to all masses gives an average yield bias of 7%.This is also included as a systematic uncertainty (Sec.VI).
Signal yields from the fits are then converted into cross sections, after correcting for signal efficiency and luminosity.

VI. SYSTEMATIC UNCERTAINTIES
Several sources of systematic uncertainties affecting the cross-section determination are taken into account: these include signal efficiency, luminosity, and fit procedure.
Uncertainties due to the trigger efficiency in signal events are evaluated by propagating the uncertainties on the measured trigger efficiencies.They are 0.3% for most of the mass spectrum, increasing to 1.7% at low masses and 0.5% at high masses.
Uncertainties due to the tracking efficiency are estimated in e + e − → τ + τ − events in which one τ decays to a single charged particle and the other τ to three charged particles.The relative uncertainty on the signal efficiency is 3.6%.
Uncertainties due to the muon identification requirement are studied using e + e − → µ + µ − γ, e + e − → e + e − µ + µ − events, and final states with a J/ψ.The relative uncertainty on the signal efficiency varies between 0.7% and 3%, depending on the X mass.
Beam backgrounds in the calorimeter can accidentally veto events due to the requirements on photons (Sec.IV C).The effect is studied by changing the level of beam backgrounds in the simulation and by varying the photon energy requirement (see Sec. IV) according to the calorimeter resolution.The relative uncertainty on the signal efficiency due to this source is estimated to be below 1%.
To evaluate uncertainties due to the data-to-simulation discrepancies in MLP selection efficiencies, we apply a tight selection on M (4µ) around √ s/c 2 requiring it to be in the range 10.54-10.62GeV/c 2 .With this selection, data and background simulation are more directly comparable, because ISR and FSR effects are much less important.We compare MLP efficiencies, defined as the ratio of the number of events before and after the MLP selection, in data and simulation and assume that the uncertainties estimated in those signal-like conditions are representative of signal.We also assume that these uncertainties hold in the full M (4µ) interval 10-11 GeV/c 2 for the signal, which is generated with ISR.The differences found in each MLP range vary between 1.1% and 8.1%, which are taken as estimates of the systematic uncertainties.To exclude potential bias from the presence of a signal, we check that these differences do not change if we exclude, in each MLP range and for each of the 2315 mass points, intervals ten times larger than the signal mass resolution around the test masses.
Uncertainties due to the interpolation of the signal efficiency between simulated points are estimated to be 3%, which is assigned as a relative uncertainty on the signal efficiency.
Uncertainties due to the fit procedure, in addition to that arising from mass resolution, are evaluated using a bootstrap technique [49].A number of simulated signal events corresponding to the yield excluded at 90% confidence level are overlaid on simulated background and fitted for each Z ′ mass.The distribution of the difference between the overlaid and the fitted yields, divided by the fit uncertainty, shows a negligible average bias with a width that deviates from one by 4%, which is assigned as a relative uncertainty on the signal-yield determination.Additional uncertainties related to the fit procedure are those due to the mass resolution, discussed in Sec.V.An uncertainty of 7%, equal to the average yield bias, is included.Systematic uncertainties from data-to-simulation differences in momentum resolution and beam-energy shift are found to be negligible, due to the kinematic fitting procedure.Finally, the integrated luminosity has a systematic uncertainty of 1% [29].
The uncertainties are summed in quadrature to give a total that ranges from 9.5% to 12.9% depending on the X mass.The contributions to the systematic uncertainty are summarized in Table I.The systematic uncertainties are included as nuisance parameters with Gaussian constraints on the signal efficiency, with widths equal to the estimated systematic uncertainties.

VII. RESULTS
The significance of signal over background for each fit is evaluated as 2 log(L/L 0 ), where L and L 0 are the likelihoods of the fits with and without signal.The largest local one-sided significance observed is 3.4σ at M (µµ) = 5.307 GeV/c 2 , corresponding to a 1.6σ global significance after taking into account the look-elsewhere effect [50,51].The corresponding fit is shown in Fig. 12.Three additional mass points have local significances that exceed 3σ.They are at M (µµ) masses of 1.939 GeV/c 2 , 4.518 GeV/c 2 , and 4.947 GeV/c 2 , with global significances of 0.6σ, 1.2σ, and 1.1σ, respectively.
Since we do not observe any significant excess above the background, we derive 90% confidence level (CL) upper limits (UL) on the process cross sections σ(e + e − → µ + µ − X) × B(X → µ + µ − ) separately for Z ′ and S (Fig. 13), using the frequentist procedure CL S [52].The expected limits in Fig. 13 are the median limits from background-only simulated samples that use yields from fits to data.We obtain upper limits ranging from 0.046 fb to 0.97 fb for the L µ − L τ model, and from 0.055 fb to 1.3 fb for the muonphilic scalar model.These upper limits are dominated by sample size, with systematic uncertainties worsening them on average by less than 1%.The cross-section results are translated into upper limits on the coupling constant g ′ of the L µ − L τ model and on the coupling constant g S of the muonphilic scalar model (Fig. 14).For masses below 6 GeV/c 2 , they range from 0.0008 to 0.039 for the L µ − L τ model and from 0.0018 to 0.040 for the muonphilic-scalar model.These limits exclude the L µ − L τ model and the muonphilic scalar model as explanations of the (g − 2) µ anomaly for 0.8 < m Z ′ < 4.9 GeV/c 2 and 2.9 < m S < 3.5 GeV/c 2 , respectively.Our constraints on g ′ are similar to those set by BABAR [19] for m Z ′ above 1 GeV/c 2 and to those set by Belle [20] on the full m Z ′ spectrum, both based on much larger integrated luminosities than ours.For the muonphilic scalar model, we do not show the constraints in Ref. [18], since they may not take into account all the experimental details affecting the signal efficiency, particularly those related to the higher momentum spectrum compared to the Z ′ .Numerical results of the cross-section and the coupling constant of the Z ′ and the S are available in the Supplemental Material [56].

VIII. CONCLUSION
We search for the process e + e − → µ + µ − X with X → µ + µ − , X = Z ′ , S in a data sample of electronpositron collisions at 10.58 GeV collected by Belle II at SuperKEKB in 2020 and 2021, corresponding to an integrated luminosity of 178 fb −1 .We find no significant excess above the background.We set upper limits on the cross sections for masses between 0.212 GeV/c 2 and 9 GeV/c 2 , ranging from 0.046 fb to 0.97 fb for the L µ − L τ model, and from 0.055 fb to 1.3 fb for the muonphilic scalar model.We derive exclusion limits on the couplings for the two different models.For masses below 6 GeV/c 2 , they range from 0.0008 to 0.039 for the L µ −L τ model and from 0.0018 to 0.040 for the muonphilic-scalar model.These limits exclude the L µ − L τ model and the muonphilic scalar model as explanations of the (g − 2) µ anomaly for 0.8 < m Z ′ < 4.9 GeV/c 2 and 2.9 < m S < 3.5 GeV/c 2 , respectively.These are the first results for the muonphilic scalar model based on a realistic evaluation of the signal efficiency that takes into account all the experimental details.This work, based on data collected using the Belle II detector, which was built and commissioned prior to   [22,23] for invisible Z ′ decays, and from BABAR [19], Belle [20], and CMS [21] (95% CL) searches for Z ′ decays to muons, along with constraints (95% CL) derived from the trident production in neutrino experiments [53][54][55].The red band in each panel shows the region that explains the muon anomalous magnetic moment (g − 2)µ ± 2σ.

Figure 3 :
Figure 3: Dimuon invariant-mass distribution in data and simulation for candidates passing all selections but the final background suppression.Contributions from the various simulated processes are stacked.The subpanel shows the datato-simulation ratio.

Figure 7 :
Figure 7: Candidate-muon-pair transverse momentum with respect the maximum momentum recoil-muon direction versus the candidate-muon-pair transverse momentum with respect to the minimum momentum recoil-muon direction (with the sign of the longitudinal projection) for simulated signal (left) with m Z ′ = 3 GeV/c 2 and simulated background (right), for dimuon masses 2.75 < M (µµ) < 3.25 GeV/c 2 .

Figure 10 :
Figure 10: Signal efficiency as a function of m Z ′ (purple dots) and mS (orange triangles) masses after all selections are applied.

Figure 11 :
Figure 11: Dimuon invariant-mass distribution in data and simulation for candidates passing all the selections.Contributions from the various simulated process are stacked.The subpanel shows the data-to-simulation ratio.

Figure 12 :
Figure 12: Fit for a Z ′ mass hypothesis of 5.307 GeV/c 2 , for which we obtain the maximum local significance of 3.4σ.

Figure 13 :
Figure13: Observed 90% confidence level upper limits and corresponding expected limits on the cross sections for the processes e + e − → µ + µ − X with X → µ + µ − , X = Z ′ , S, as functions of the Z ′ mass (top) and S mass (bottom).
March 2019, was supported by Higher Education and Science Committee of the Republic of Armenia Grant No. 23LCG-1C011; Australian Research Council and Research Grants No. DP200101792, No. DP210101900, No. DP210102831, No. DE220100462, No. LE210100098, and No. LE230100085; Austrian Federal Ministry of Education, Science and Research, Austrian Science Fund

Figure 14 :
Figure 14: Observed 90% CL upper limits and corresponding expected limits on (top) the Lµ − Lτ model coupling g ′ and on (bottom) the muonphilic scalar model coupling gS.Also shown in the top panel are constraints from Belle II[22,23] for invisible Z ′ decays, and from BABAR[19], Belle[20], and CMS[21] (95% CL) searches for Z ′ decays to muons, along with constraints (95% CL) derived from the trident production in neutrino experiments[53][54][55].The red band in each panel shows the region that explains the muon anomalous magnetic moment (g − 2)µ ± 2σ.

Table I :
Systematic uncertainties affecting the cross-section determination.
. P 34529, No. J 4731, No. J 4625, and No. M 3153, and Horizon 2020 ERC Starting Grant No. 947006 "In-terLeptons"; Natural Sciences and Engineering Research Council of Canada, Compute Canada and CANARIE; National Key R&D Program of China under Contract No. 2022YFA1601903, National Natural Science Foundation of China and Research Grants No. 11575017, No. 11761141009, No. 11705209, No. 11975076, No. 12135005, No. 12150004, No. 12161141008, and No. 12175041, and Shandong Provincial Natural Science Foundation Project ZR2022JQ02; the Czech Science Foundation Grant No. 22-18469S and Charles University Grant Agency project No. 246122; European Research Council, Seventh Framework PIEF-GA-2013-622527, Horizon 2020 ERC-Advanced Grants No. 267104 and No. 884719, Horizon 2020 ERC-Consolidator Grant