Search for lepton-flavor-violation in $Z$-boson decays with $\tau$-leptons with the ATLAS detector

A search for lepton-flavor-violating $Z\to e\tau$ and $Z\to\mu\tau$ decays with $pp$ collision data recorded by the ATLAS detector at the LHC is presented. This analysis uses 139 fb$^{-1}$ of Run 2 $pp$ collisions at $\sqrt{s}=13$ TeV and is combined with the results of a similar ATLAS search in the final state in which the $\tau$-lepton decays hadronically, using the same data set as well as Run 1 data. The addition of leptonically decaying $\tau$-leptons significantly improves the sensitivity reach for $Z\to\ell\tau$ decays. The $Z\to\ell\tau$ branching fractions are constrained in this analysis to $\mathcal{B}(Z\to e\tau)<7.0\times10^{-6}$ and $\mathcal{B}(Z\to \mu\tau)<7.2\times10^{-6}$ at 95% confidence level. The combination with the previously published analyses sets the strongest constraints to date: $\mathcal{B}(Z\to e\tau)<5.0\times10^{-6}$ and $\mathcal{B}(Z\to \mu\tau)<6.5\times10^{-6}$ at 95% confidence level.

Three lepton families (flavors) exist in the standard model of particle physics (SM) [1][2][3][4], and the number of leptons of each family is conserved in their interactions. Nevertheless, this conservation is not postulated by any fundamental principle of the theory, and neutrino oscillations [5,6] indicate that processes violating this conservation do occur in nature. According to current knowledge, lepton-flavor-violating (LFV) processes in charged-lepton interactions can occur via neutrino mixing but are too rare to be detected by current experiments [7]. An observation of such processes would therefore be an unambiguous sign of physics beyond the SM. LFV processes occur, for example, in models predicting the existence of heavy neutrinos [8], which may also explain the observed tiny masses and large mixing of the SM neutrinos. In such models, up to one in $10^{5}$ $Z$ bosons would undergo an LFV decay involving $\tau$ leptons. In an earlier analysis, the ATLAS experiment at the LHC set the strongest constraints on the branching fractions ($\mathcal{B}$) of the LFV decays of the $Z$ boson involving a $\tau$ lepton by searching for such decays in which the $\tau$ lepton decays hadronically [9]. This result was achieved by analyzing proton-proton ($pp$) collision data corresponding to an integrated luminosity of 139 fb$^{-1}$ at a center-of-mass energy $\sqrt{s}=13$ TeV and 20.3 fb$^{-1}$ at $\sqrt{s}=8$ TeV. In that search, ATLAS constrained the branching fractions to $\mathcal{B}(Z\to e\tau)<8.1\times10^{-6}$ and $\mathcal{B}(Z\to\mu\tau)<9.5\times10^{-6}$ at 95% confidence level (C.L.), superseding former limits set by the LEP experiments of $\mathcal{B}(Z\to e\tau)<9.8\times10^{-6}$ [10] and $\mathcal{B}(Z\to\mu\tau)<1.2\times10^{-5}$ [11] at 95% C.L.
This Letter presents a complementary search for $Z\to\ell\tau$ decays ($\ell$ = light charged lepton, i.e. $e$ or $\mu$) in which the $\tau$ leptons decay into electrons or muons ($\ell\tau_{\ell}$ channel) using 139 fb$^{-1}$ of $pp$ collision data at $\sqrt{s}=13$ TeV collected by the ATLAS experiment [12][13][14]. The search is performed here for the first time at the LHC and is combined with the similar ATLAS search using hadronic $\tau$-lepton decays ($\ell\tau_{\mathrm{had}}$ channel) [9]. The two searches follow similar analysis strategies. Neural network classifiers are used for optimal discrimination of signal from backgrounds, and their output distributions are employed in a binned maximum-likelihood fit to achieve better sensitivity.
ATLAS is a multipurpose particle detector with a forward-backward symmetric cylindrical geometry and nearly $4\pi$ coverage in solid angle [12,15,16]. It consists of an inner tracking detector surrounded by a superconducting solenoid, electromagnetic and hadronic calorimeters, and a muon spectrometer based on superconducting air-core toroidal magnets.
This search analyzes $pp$ collision events recorded by the ATLAS experiment using single-electron or single-muon triggers [17][18][19]. Prompt electrons and muons from the $Z$-boson decays and those from the $\tau$-lepton decays are reconstructed and selected in the same way. Candidates for electrons [20], muons [21], jets [22][23][24], and visible decay products of hadronic $\tau$-lepton decays ($\tau_{\mathrm{had\text{-}vis}}$) [25,26] are reconstructed from energy deposits in the calorimeters and charged-particle tracks measured in the inner detector and the muon spectrometer. These candidates are selected with sets of requirements similar to those used in Ref. [9]. Electron candidates are required to pass the medium likelihood-based identification requirement [20] and have a transverse momentum $p_{\mathrm{T}}>15$ GeV and a pseudorapidity $|\eta|<1.37$ or $1.52<|\eta|<2.47$. The latter selection vetoes electron candidates passing through the transition region between the barrel and end-cap electromagnetic calorimeters.
Muon candidates are required to pass the medium identification requirement [27] and have $p_{\mathrm{T}}>10$ GeV and $|\eta|<2.5$. Both the electron and muon candidates must satisfy the tight isolation requirement [20,27], which uses tracks and clusters reconstructed collinear to the candidates to reject misidentified candidates produced in the hadronization of quarks or gluons. Events with exactly one electron and one muon candidate are selected with the requirement that the lepton with higher transverse momentum has $p_{\mathrm{T}}>27$ GeV. This selection lies above the threshold for constant efficiency of both single-lepton trigger selections. Events with same-flavor lepton pairs are rejected in order to reduce the background from $Z\to\ell\ell$ decays. Events with a leading-$p_{\mathrm{T}}$ electron are used in the search for $Z\to e\tau$ decays ($e\tau_{\mu}$ channel), while those with a leading-$p_{\mathrm{T}}$ muon are used in the search for $Z\to\mu\tau$ decays ($\mu\tau_{e}$ channel), assuming the prompt lepton from the $Z$-boson decay is the leading one in $p_{\mathrm{T}}$. In the $\mu\tau_{e}$ channel, the ratio of the electron's $p_{\mathrm{T}}$ reconstructed in the inner tracking detector to the transverse energy reconstructed in the electromagnetic calorimeter, $p_{\mathrm{T}}^{\mathrm{track}}(e)/E_{\mathrm{T}}^{\mathrm{cluster}}(e)$, is required to be smaller than 1.1 in order to reject $Z\to\mu\mu$ events. Opposite-charge lepton-pair events are analyzed in the search for signal events, while events with same-charge lepton pairs are used for estimates of background processes.
Quark- or gluon-initiated particle showers (jets) are reconstructed using the anti-$k_t$ algorithm [22,23] with a radius parameter $R=0.4$. Jets fulfilling $p_{\mathrm{T}}>20$ GeV and $|\eta|<2.5$ are identified as containing $b$-hadrons if tagged by a dedicated multivariate algorithm [28]. To ensure the samples of selected events do not overlap with those used in the $\ell\tau_{\mathrm{had}}$ channel, events with a $\tau_{\mathrm{had\text{-}vis}}$ candidate are vetoed. The $\tau_{\mathrm{had\text{-}vis}}$ candidates reconstructed from jets with $p_{\mathrm{T}}>10$ GeV and with one or three associated tracks are selected in $|\eta|<1.37$ or $1.52<|\eta|<2.5$.
The $\tau_{\mathrm{had\text{-}vis}}$ identification is performed by a recurrent neural network algorithm [25]. A $\tau_{\mathrm{had\text{-}vis}}$ candidate is required to have $p_{\mathrm{T}}>25$ GeV and pass the tight identification selection. The missing transverse momentum ($E_{\mathrm{T}}^{\mathrm{miss}}$) is calculated as the negative vectorial $p_{\mathrm{T}}$ sum of all fully reconstructed and calibrated physics objects [29,30]. Additionally, the calculation includes inner detector tracks that originate from the vertex associated with the hard-scattering process but are not associated with any of the reconstructed objects.
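The kinematic part of the lepton and event selection described above can be summarized in a short sketch. This is an illustration only, not the ATLAS analysis code: the `Lepton` class, the function names, and the channel labels are hypothetical, and the identification, isolation, trigger, and $p_{\mathrm{T}}^{\mathrm{track}}(e)/E_{\mathrm{T}}^{\mathrm{cluster}}(e)$ requirements are assumed to have been applied elsewhere.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Lepton:
    """Hypothetical minimal lepton record (not an ATLAS data format)."""
    flavor: str   # "e" or "mu"
    pt: float     # transverse momentum in GeV
    eta: float    # pseudorapidity
    charge: int   # +1 or -1


def passes_kinematics(lep: Lepton) -> bool:
    """Kinematic requirements quoted in the text (ID and isolation not modeled)."""
    abs_eta = abs(lep.eta)
    if lep.flavor == "e":
        # Electrons: pT > 15 GeV, outside the barrel/end-cap transition region.
        return lep.pt > 15.0 and (abs_eta < 1.37 or 1.52 < abs_eta < 2.47)
    if lep.flavor == "mu":
        # Muons: pT > 10 GeV and |eta| < 2.5.
        return lep.pt > 10.0 and abs_eta < 2.5
    return False


def classify_event(leptons: List[Lepton],
                   n_tau_had_vis: int = 0) -> Optional[Tuple[str, str]]:
    """Return (channel, charge category) or None if the event is rejected."""
    if n_tau_had_vis > 0:
        return None  # veto keeps the sample disjoint from the hadronic channel
    good = [lep for lep in leptons if passes_kinematics(lep)]
    n_e = sum(lep.flavor == "e" for lep in good)
    n_mu = sum(lep.flavor == "mu" for lep in good)
    if n_e != 1 or n_mu != 1:
        return None  # exactly one electron and one muon candidate
    lead, sub = sorted(good, key=lambda lep: lep.pt, reverse=True)
    if lead.pt <= 27.0:
        return None  # leading lepton must be above the trigger plateau
    # The leading lepton is assumed to be the prompt one from the Z decay.
    channel = "etau_mu" if lead.flavor == "e" else "mutau_e"
    charge = "OS" if lead.charge * sub.charge < 0 else "SS"
    return channel, charge
```

In this sketch, opposite-sign (`"OS"`) events would enter the signal search and same-sign (`"SS"`) events the background estimates, mirroring the roles described in the text.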
The $Z\to\ell\tau\to\ell\ell+2\nu$ signal events are characterized by a final state which has two light charged leptons with different flavor and opposite electric charge, two neutrinos, and an invariant mass of all these particles compatible with the $Z$-boson mass. In most cases, these two leptons are emitted approximately back-to-back in the plane transverse to the proton beam direction. Since the $\tau$ lepton is typically boosted due to the large difference between its mass and the mass of its parent $Z$ boson, the two neutrinos from its decay are usually almost collinear with the charged lepton from the $\tau$-lepton decay. The dominant background contribution is from the lepton-flavor-conserving $Z\to\tau\tau\to\ell\ell+4\nu$ decays, where the two $\tau$-leptons decay leptonically. Subleading background contributions from other SM processes with final states with two prompt leptons include the decays of a top-antitop-quark pair ($t\bar{t}$), two gauge bosons (diboson), or a Higgs boson. Finally, small background contributions come from $Z\to\ell\ell$ decays, where one of the light charged leptons is misidentified with the wrong flavor, and events with "fake leptons". The latter type of background events includes mostly $W(\to\ell\nu)$+jets events with leptons from heavy-flavor quark decays or with light-quark-initiated jets that are misidentified as electrons or muons.
The signal and background events are separated by using a set of selection criteria that define a signal-enhanced sample, referred to as the signal region (SR). The selection criteria are listed in Table 1. Three neural network (NN) binary classifiers similar to those used in Ref. [9] are trained on simulated events to distinguish signal events from $Z\to\tau\tau$, top-quark pair, and diboson background events individually. The input to these NNs is a mixture of low- and high-level kinematic variables, following the same strategy as in the $\ell\tau_{\mathrm{had}}$ channel [9]. The low-level variables are the momentum components of the reconstructed electron and muon candidates, and the $E_{\mathrm{T}}^{\mathrm{miss}}$.
The high-level variables are kinematic properties of the $e$-$\mu$-$E_{\mathrm{T}}^{\mathrm{miss}}$ system, such as the collinear mass $m_{\mathrm{coll}}(e,\mu)$, defined as the invariant mass of the $e$-$\mu$-$2\nu$ system, where the two neutrinos are assumed to have a vectorial momentum sum that is equal in $p_{\mathrm{T}}$ and the azimuthal angle around the beam axis to the measured $E_{\mathrm{T}}^{\mathrm{miss}}$ and equal in $\eta$ to the subleading-$p_{\mathrm{T}}$ lepton momentum. The outputs of the individual NNs (with values between zero and one) are combined into a final discriminant as shown in Eq. (1), hereafter referred to as the "combined NN output".
Events classified by the NN trained for $Z\to\tau\tau$ as backgroundlike are excluded from the SR and used in a control region to better determine the $Z\to\tau\tau$ background in the maximum-likelihood fit (see Table 1).
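The collinear-mass definition above can be made concrete with a minimal sketch. It assumes massless leptons and a massless neutrino system, which the text does not state explicitly, and the function names are hypothetical.

```python
import math


def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) from collider coordinates (GeV, radians)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def collinear_mass(lead, sub, met, met_phi):
    """Invariant mass of the two leptons plus the neutrino system.

    lead, sub: (pt, eta, phi) of the leading and subleading lepton.
    The neutrino system takes its pT and azimuth from the missing
    transverse momentum and its eta from the subleading lepton.
    Assumption: all three objects are treated as massless.
    """
    vectors = [
        four_vector(*lead),
        four_vector(*sub),
        four_vector(met, sub[1], met_phi),
    ]
    energy, px, py, pz = (sum(component) for component in zip(*vectors))
    m_squared = energy * energy - (px * px + py * py + pz * pz)
    return math.sqrt(max(m_squared, 0.0))  # guard against rounding below zero
```

For a central, back-to-back topology in which the missing transverse momentum points along the subleading lepton, the collinear mass recovers the full mass of the parent, while the visible dilepton mass alone would underestimate it.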
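Table 1: Selection criteria for the SR and their purpose.
No $\tau_{\mathrm{had\text{-}vis}}$ candidate: complementarity to the $\ell\tau_{\mathrm{had}}$ channel.
Neural network (optimized for signal vs. $Z\to\tau\tau$) output > 0.2: complementarity to the CRZ$\tau\tau$ region.
In the $\mu\tau_{e}$ channel, $p_{\mathrm{T}}^{\mathrm{track}}(e)/E_{\mathrm{T}}^{\mathrm{cluster}}(e)<1.1$: rejection of $Z\to\mu\mu$ events.
The signal acceptance in the SR is 19.5% for the $e\tau_{\mu}$ channel and 11.2% for the $\mu\tau_{e}$ channel, as determined from simulated signal samples. The lower acceptance in the $\mu\tau_{e}$ channel is due to the higher $p_{\mathrm{T}}$ threshold on the subleading-$p_{\mathrm{T}}$ lepton and the additional selection on $p_{\mathrm{T}}^{\mathrm{track}}/E_{\mathrm{T}}^{\mathrm{cluster}}$.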
Predictions for signal and background contributions are based partly on Monte Carlo (MC) simulations and partly on estimates from data. Signal and background processes were simulated as in Ref. [9]. The signal events were simulated using Pythia 8 [31] with matrix elements calculated at leading order (LO) in the strong coupling constant. Nominal signal samples were generated with a parity-conserving $Z\ell\tau$ vertex and unpolarized $\tau$ leptons. Scenarios where the decays are maximally parity violating were considered by reweighting the simulated events using TauSpinner [32], as discussed in Ref. [9]. The $Z\to\tau\tau$ background events were simulated with the Sherpa 2.2.1 [33] generator using the NNPDF 3.0 NNLO PDF set [34] and next-to-leading-order (NLO) matrix elements for up to two partons, and LO matrix elements for up to four partons, calculated with the Comix [35] and OpenLoops [36][37][38] libraries. Background $Z\to\ell\ell$ events were simulated using the Powheg-Box [39] generator with NLO matrix elements. All MC samples include a detailed simulation of the ATLAS detector with Geant4 [40,41].
As in Ref. [9], the simulation of $Z$-boson production is improved through a correction derived from measurements in data. The simulated $p_{\mathrm{T}}$ spectra of the $Z$ boson are reweighted to match the unfolded distribution measured by ATLAS in Ref. [42]. The predicted overall yields of signal and $Z\to\tau\tau$ events are determined by a binned maximum-likelihood fit to the combined data in the SR and in a control region enhanced in $Z\to\tau\tau$ events (CRZ$\tau\tau$). This eliminates the theoretical uncertainties in the total $Z$-boson production cross section ($\sigma_Z$), as well as the experimental uncertainties related to the acceptance of the common $\ell\ell$ final state. The selection criteria for events in the CRZ$\tau\tau$ are the same as those for events in the SR, except that events are required to be classified as $Z\to\tau\tau$-like, i.e. with an output smaller than 0.2 for the $Z\to\tau\tau$ NN and greater than 0.2 for both the top-quark and diboson NNs.
In the $\mu\tau_{e}$ channel, a small contribution to the total background originates from $Z\to\mu\mu$ events in which one muon is misreconstructed as an electron. Such electron candidates may originate from muons that fail the muon selection requirements and whose tracks are associated with a calorimeter energy cluster and reconstructed as electrons. They may also originate from muons undergoing bremsstrahlung. Such events are modeled with simulation and their predicted yield is based on the measured $\sigma_Z$ [43]. The modeling is validated in a dedicated region which has the same selection as the SR except for the inverse selection on $p_{\mathrm{T}}^{\mathrm{track}}(e)/E_{\mathrm{T}}^{\mathrm{cluster}}(e)$. Based on the observed level of agreement between data and simulation, a systematic uncertainty of 15% is assigned to the predicted yield of $Z\to\mu\mu$ events in the SR, with no further correction.
Events with fake leptons yield a small but still significant background contribution. In most cases, the fake lepton is the subleading one. These events are estimated from data using a "fake-factor method" similar to the one used in Ref. [9]. The fake factor is defined as the ratio $N_{\mathrm{fake}}^{\mathrm{pass\text{-}iso}}/N_{\mathrm{fake}}^{\mathrm{fail\text{-}iso}}$, where "fake" indicates events with at least one fake lepton and "pass-iso" or "fail-iso" indicate whether the subleading lepton passes or fails the isolation requirement. The fake factor is measured in events with pairs of same-sign leptons (SS). These events are enhanced in $W(\to\ell\nu)$+jets, which is the dominant source of events with fake leptons in the SR. Events in the SS region pass the same event selections as those in the SR except for a same-charge requirement. The fake factors are measured as functions of the transverse momentum and pseudorapidity of the leptons, separately for $e\tau_{\mu}$ and $\mu\tau_{e}$ events. The kinematic properties of events with fake leptons in the SR or in the CRs are estimated from the distributions of events with the subleading lepton failing the isolation requirement, but otherwise satisfying all other selection criteria for that region, multiplied by the fake factor. The total predicted yields of the events with fake leptons in the SR and CRs are instead determined by a combined maximum-likelihood fit to data, separately for $e\tau_{\mu}$ and $\mu\tau_{e}$ events.
The remaining background processes are estimated using simulations. These backgrounds include events from the production and decay of top quarks [31,39], pairs of gauge bosons [33,34], and the Higgs boson [31,39]. The yield of the events with top quarks is determined in the maximum-likelihood fit to data via the inclusion of a top-quark control region (CRTop). The selection requirements for the CRTop are the same as for the SR except that at least one $b$-tagged jet is required.
The expected event yields of the remaining processes are determined based on their production cross section, the integrated luminosity, and the simulated selection efficiency.
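The fake-factor method described above can be illustrated with a short sketch. It is a simplified picture under stated assumptions: the input counts are taken to be fake-lepton events only (any subtraction of prompt-lepton contributions is assumed to have been done beforehand), the binning keys are hypothetical, and the resulting template provides only the shape, since the overall fake-lepton yield is left to the fit.

```python
def measure_fake_factors(ss_counts):
    """Fake factor per kinematic bin from the same-sign (SS) region.

    ss_counts maps a (pt_bin, eta_bin) key to a pair
    (n_pass_iso, n_fail_iso) of fake-lepton event counts.
    Bins with no fail-iso events are skipped.
    """
    return {
        kin_bin: n_pass / n_fail
        for kin_bin, (n_pass, n_fail) in ss_counts.items()
        if n_fail > 0
    }


def fake_template(fail_iso_events, fake_factors, n_bins):
    """Histogram of an observable for fake-lepton events in a target region.

    fail_iso_events: list of (kin_bin, obs_bin) for events that satisfy the
    region selection except that the subleading lepton fails isolation.
    Each event is weighted by the fake factor of its kinematic bin.
    """
    histogram = [0.0] * n_bins
    for kin_bin, obs_bin in fail_iso_events:
        histogram[obs_bin] += fake_factors[kin_bin]
    return histogram
```

Weighting event by event, rather than scaling the whole fail-iso histogram by one number, is what lets the fake factor depend on the lepton transverse momentum and pseudorapidity as stated in the text.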
A statistical analysis of the selected events is performed to assess the presence of signal events, following the same method used in Ref. [9]. A simultaneous binned maximum-likelihood fit to the combined NN output distribution in the SR, the $m_{\mathrm{coll}}(e,\mu)$ distribution in the CRZ$\tau\tau$, and the event yield in CRTop is used to constrain uncertainties in the predictions and extract evidence of a possible signal. The fit is performed independently for the $e\tau_{\mu}$ and $\mu\tau_{e}$ channels. The fraction of $Z\to e\tau$ events selected in the $\mu\tau_{e}$ channel (and vice versa) is negligible and is therefore neglected. In order to improve the discrimination between signal and the events with fake leptons, the events in the SR are further split into two regions based on the transverse momentum of the subleading-$p_{\mathrm{T}}$ lepton $\ell_1$. The low-$p_{\mathrm{T}}$ SR contains events with $p_{\mathrm{T}}(\ell_1)<20$ (25) GeV in the $e\tau_{\mu}$ ($\mu\tau_{e}$) channel, while the high-$p_{\mathrm{T}}$ SR contains the events above these thresholds. Both SRs in the $e\tau_{\mu}$ channel have comparable sensitivity, while the low-$p_{\mathrm{T}}$ SR in the $\mu\tau_{e}$ channel is more sensitive than the high-$p_{\mathrm{T}}$ SR. Both SRs are fitted simultaneously.
There are four unconstrained parameters in the fits: the parameter of interest, which determines the LFV branching fraction $\mathcal{B}(Z\to\ell\tau)$ by modifying an arbitrary prefit signal yield; a $Z$-boson normalization parameter, which determines $\sigma_Z$ times the overall acceptance and reconstruction efficiency of the $\ell\ell$ final state in $Z\to\tau\tau$ and signal events; a top-quark normalization parameter, which determines the yield of the top-quark events; and a fake-lepton normalization parameter, which determines the yield of the events with fake leptons. Constrained parameters are also introduced to account for systematic uncertainties in the signal and background predictions, as in Ref. [9]. These include uncertainties in simulated events in the modeling of trigger, reconstruction, identification, and isolation efficiencies, as well as energy calibrations and resolutions of reconstructed objects.
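The role of the four unconstrained parameters can be sketched with a toy binned Poisson likelihood. This is a minimal illustration, not the fit model of the analysis: the parameter names (`poi`, `k_z`, `k_top`, `k_fakes`) and the template dictionary are hypothetical, a single region is shown, and constrained nuisance parameters for systematic uncertainties are omitted.

```python
import math


def expected_yields(templates, poi, k_z, k_top, k_fakes):
    """Per-bin expectation from prefit templates and normalization parameters.

    templates maps "signal", "ztautau", "top", "fakes", and "other" to lists
    of prefit yields. The Z normalization scales both the Z->tautau
    background and the signal, as described in the text.
    """
    n_bins = len(templates["signal"])
    return [
        poi * k_z * templates["signal"][i]
        + k_z * templates["ztautau"][i]
        + k_top * templates["top"][i]
        + k_fakes * templates["fakes"][i]
        + templates["other"][i]   # fixed to the simulation prediction
        for i in range(n_bins)
    ]


def binned_nll(data, templates, poi, k_z=1.0, k_top=1.0, k_fakes=1.0):
    """Negative log-likelihood of independent Poisson bins."""
    nll = 0.0
    for n_obs, nu in zip(data, expected_yields(templates, poi, k_z, k_top, k_fakes)):
        if nu <= 0.0:
            return math.inf  # unphysical total expectation
        nll += nu - n_obs * math.log(nu) + math.lgamma(n_obs + 1.0)
    return nll
```

Because only the total expectation per bin must stay positive, the parameter of interest may take negative values in such a fit, which is how negative best-fit branching fractions can arise when the data fluctuate below the background prediction.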
No systematic uncertainties are assigned to the overall yields of events with $Z$-boson decays, fake leptons, or top quarks, as these yields are determined from data. Uncertainties related to events with fake leptons include statistical uncertainties due to the size of the data sample used to measure the fake factors as well as to model their distributions in the SRs and CRs. Systematic uncertainties assigned to events with fake leptons account for: shape differences in the modeling of the combined NN output in the SS events; differences in the composition of the events with fake leptons between SS events and the events in the SRs; and uncertainties affecting the number of events with prompt leptons failing the isolation requirements as estimated by simulation. The dominant uncertainties of the search are statistical in nature. Among the systematic uncertainties, the dominant ones are those in the jet calibration, which enter through the calculation of the $E_{\mathrm{T}}^{\mathrm{miss}}$ [24]. A summary of the uncertainties and their impact on the LFV branching fraction is given in Table 2.
Table 2: Summary of the contributions to the uncertainty in the measured $\mathcal{B}(Z\to\ell\tau)$ in the $\ell\tau_{\ell}$ channel. The uncertainties related to light charged leptons include those in the trigger, reconstruction, identification, and isolation efficiencies, as well as energy calibrations. The uncertainties related to jets and $E_{\mathrm{T}}^{\mathrm{miss}}$ include those in the energy calibration and resolution. The uncertainty in the $Z\to\mu\mu$ yield is only applicable in the $\mu\tau_{e}$ channel. The total systematic uncertainty can differ from the sum in quadrature of the different contributions due to correlations among uncertainties as a result of the likelihood fit to data.

The observed and best-fit predicted distributions of the combined NN output in the SRs with the highest sensitivity as well as distributions of the collinear mass in the high-$p_{\mathrm{T}}$ SRs are shown in Fig. 1. The best-fit yield of $Z\to\ell\tau$ signal corresponds to the branching fractions $\mathcal{B}(Z\to e\tau)=[-2.6\pm3.5\,(\mathrm{stat})\pm2.7\,(\mathrm{syst})]\times10^{-6}$ and $\mathcal{B}(Z\to\mu\tau)=[-4.4\pm3.9\,(\mathrm{stat})\pm3.4\,(\mathrm{syst})]\times10^{-6}$. The best-fit yields of $Z\to\tau\tau$, top quarks, and events with fake leptons are close to the prefit predicted values and are determined with a relative precision of 2%-4%, except for the events with fake leptons in one of the two channels, which have an uncertainty of 30%.
Figure 1: Observed and best-fit predicted distributions of the combined NN output and of the collinear mass in the SRs of the $e\tau_{\mu}$ and $\mu\tau_{e}$ channels. The expected signal, normalized to an arbitrary $\mathcal{B}(Z\to\ell\tau)=3\times10^{-4}$ for visualization purposes, is shown as a dashed histogram in each plot. In the panel below each plot, the ratios of the observed yield (dots) and the best-fit background-plus-signal yield (solid line) to the best-fit background yield are shown. The hatched uncertainty bands represent one standard deviation of the combined statistical and systematic uncertainties. The first and last bins in each plot include underflow and overflow events, respectively.
As no significant excess of data over the predicted background is observed, a combined fit of the $\ell\tau_{\ell}$ and $\ell\tau_{\mathrm{had}}$ channels is used to set upper limits on $\mathcal{B}(Z\to\ell\tau)$. The analysis of the $\ell\tau_{\mathrm{had}}$ channel with Run 2 data [9] uses a similar scheme of regions and unconstrained parameters. In the statistical combination, the parameters of interest are correlated among the different SRs and CRs. The other unconstrained parameters are uncorrelated, as these account either for backgrounds specific to each channel or for different acceptances of the $\ell\tau_{\ell}$ or $\ell\tau_{\mathrm{had}}$ final states. Common systematic uncertainties are correlated, except those related to the jet energy calibrations, which are uncorrelated. This conservative correlation scheme was chosen because of different best-fit values for the parameters associated with these uncertainties in the two channels. However, the fit with correlated jet energy calibration uncertainties yields compatible combined upper limits. The analysis of the $\ell\tau_{\mathrm{had}}$ channel with Run 1 data is combined using the same correlation scheme as in Ref. [9]. The combined best-fit amount of $Z\to\ell\tau$ signal corresponds to the branching fractions $\mathcal{B}(Z\to e\tau)=[-1.4\pm2.5\,(\mathrm{stat})\pm1.8\,(\mathrm{syst})]\times10^{-6}$ and $\mathcal{B}(Z\to\mu\tau)=[1.7\pm2.2\,(\mathrm{stat})\pm1.6\,(\mathrm{syst})]\times10^{-6}$. Since no significant deviation from the SM background hypothesis is observed, exclusion limits are set using the CL$_s$ method [44]. The upper limits are shown in Table 3 for LFV decays with different assumptions about the $\tau$-polarization state. The polarization of the $\tau$ lepton affects the energy of its visible decay products and thus the acceptance for signal events. In the scenario where the $\tau$ leptons are unpolarized, the observed upper limits at 95% C.L. on $\mathcal{B}(Z\to e\tau)$ and $\mathcal{B}(Z\to\mu\tau)$ are $5.0\times10^{-6}$ and $6.5\times10^{-6}$, respectively.
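The idea behind the CL$_s$ method can be illustrated with a toy single-bin counting experiment. This is a deliberately simplified sketch: the analysis uses a likelihood fit over many bins with systematic uncertainties, whereas the example below assumes a known background, no uncertainties, and the observed count itself as the test statistic. All function names are hypothetical.

```python
import math


def poisson_cdf(n, mean):
    """P(N <= n) for a Poisson distribution with the given mean."""
    return sum(math.exp(-mean) * mean ** k / math.factorial(k) for k in range(n + 1))


def cls_value(n_obs, signal, background):
    """CLs = CL_{s+b} / CL_b for a counting experiment."""
    cl_sb = poisson_cdf(n_obs, signal + background)
    cl_b = poisson_cdf(n_obs, background)
    return cl_sb / cl_b


def upper_limit(n_obs, background, alpha=0.05, s_max=100.0):
    """Signal yield excluded at confidence level 1 - alpha, found by bisection.

    CLs decreases monotonically with the signal yield, so the limit is the
    yield at which CLs crosses alpha.
    """
    low, high = 0.0, s_max
    for _ in range(100):
        mid = 0.5 * (low + high)
        if cls_value(n_obs, mid, background) > alpha:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)
```

Dividing by CL$_b$ protects against excluding signal hypotheses to which the experiment has little sensitivity: with zero observed events, the toy limit is the same for any background expectation. A limit on the signal yield would then be converted into a limit on the branching fraction using the signal acceptance and the number of produced $Z$ bosons.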
In conclusion, this Letter reports the first analysis of the $\ell\tau_{\ell}$ channel in the search for $Z\to\ell\tau$ decays at the LHC. This channel yields a sensitivity similar to the $\ell\tau_{\mathrm{had}}$ channel. With the combined results of the two channels, the ATLAS experiment sets the most stringent constraints on LFV $Z$-boson decays involving $\tau$ leptons to date. The precision of these results is mainly limited by statistical uncertainties.