Testing Lepton Flavour Universality in $\Upsilon (4S)$ Decays

We propose a novel method to probe the persistent hints of Lepton Flavour Universality violation observed in semileptonic $B$ decays. Relying on the specific properties of the Belle II experiment, it consists in comparing the inclusive rates of $\Upsilon(4S) \to e^\pm \mu^\mp X$, $\Upsilon(4S) \to \mu^\pm \tau_{\rm had}^\mp X$ and $\Upsilon(4S) \to e^\pm \tau_{\rm had}^\mp X$. We show that such a measurement can be directly related to the ratio $R(X)_{\tau\ell} \equiv \Gamma (b\to X \tau \nu ) / \Gamma (b \to X \ell \nu)$ ($\ell=e$ or $\mu$), once appropriate experimental cuts are applied to suppress the effects of neutral $B$ mixing and leptons emitted through charm or tau decays. Such a measurement would thus constitute an additional and potentially competitive probe of Lepton Flavour Universality in $b\to c\ell\nu$ transitions, complementary to existing exclusive measurements, accessible in the Belle II environment.

In the following we propose a related but potentially more direct test of lepton flavour universality (LFU) through inclusive dileptonic $\Upsilon(4S)$ decays. We define $\mathcal{B}(\Upsilon(4S) \to \ell_1^\pm \ell_2^\mp X)$ as the inclusive dileptonic branching fraction for $\Upsilon(4S)$ decays to a pair of oppositely charged leptons of different flavours, where $X$ denotes all the other (hadronic) activity and missing momentum in the event. This fully inclusive measurement exploits several key capabilities of the Belle II experiment as well as some specific features of $\Upsilon(4S)$ and $b$-hadron decays. On the experimental side, the excellent beam energy calibration of SuperKEKB can ensure that the $\Upsilon(4S)$ resonance is produced on shell even if its invariant mass is not reconstructed explicitly from the final state. This also allows the non-resonant background to be well estimated from sideband measurements. On the theory side, this inclusive decay is almost entirely saturated by decays into $B\bar B$ final states. Moreover, one can analyse in detail the production of the leptons, either from an initial $b$-quark decay or from subsequent parts of the decay chain.
All in all, ratios of the form $R^{\Upsilon(4S)}_{\ell_2\ell_3} \equiv \mathcal{B}(\Upsilon(4S) \to \ell_1^\pm \ell_2^\mp X)/\mathcal{B}(\Upsilon(4S) \to \ell_1^\pm \ell_3^\mp X)$, where $\ell_1$, $\ell_2$, $\ell_3$ are three different lepton flavours among $e$, $\mu$, $\tau$, provide a very interesting ground to probe LFU with an inclusive measurement at Belle II, complementary to the exclusive measurements accessible to both the Belle II and LHCb experiments. In particular, under suitable experimental conditions one can relate $R^{\Upsilon(4S)}_{\tau\ell} = R(X)_{\tau\ell} + \ldots$, where $\ell = e, \mu$ and $R(X)_{\tau\ell} \equiv \Gamma(B \to X\tau\nu)/\Gamma(B \to X\ell\nu)$ is the inclusive $B$-decay LFU ratio, which can be precisely computed in the SM as $R(X)_{\tau\ell} = 0.223(4)$ [21]. The dots denote corrections due to neutral $B$-meson mixing effects and charm pollution. In the following we discuss both effects and estimate the accuracy with which this ratio can be measured and compared with the SM expectation, in order to extract potential violations of LFU.

B. Lepton production
Having established that $\Upsilon(4S)$ decays almost exclusively into pairs of $B$ mesons, we consider their subsequent decays inclusively, focusing on final states containing leptons $\ell^{(\prime)} = e, \mu, \tau$. We can differentiate between several measurable inclusive dilepton signatures such as $\Upsilon(4S) \to e^\pm \mu^\mp X$, $\Upsilon(4S) \to \mu^\pm \tau_{\rm had}^\mp X$ and $\Upsilon(4S) \to e^\pm \tau_{\rm had}^\mp X$, where $\tau_{\rm had}$ denotes a $\tau$ lepton reconstructed from its hadronic decays (e.g. $\tau \to 3\pi\nu$). We will thus define the ratios $R^{\Upsilon(4S)}_{\tau_{\rm had}\mu}$ and $R^{\Upsilon(4S)}_{\tau_{\rm had}e}$, which compare the $e^\pm \tau_{\rm had}^\mp X$ and $\mu^\pm \tau_{\rm had}^\mp X$ rates, respectively, to the $e^\pm \mu^\mp X$ rate.
To relate these ratios to inclusive $B$-decay LFU ratios, we need to isolate contributions where each of the two leptons is produced in a separate $B$-meson decay and suppress backgrounds where one or both leptons do not originate from a direct semileptonic $B$ decay. Requiring opposite-sign leptons of different flavours removes contributions from $\Upsilon(4S) \to X + ((b\bar b) \to \ell^+\ell^-)$ and $b \to q((c\bar c) \to \ell^+\ell^-)$, as well as from rare FCNC (semileptonic) $B$ and charm decays. This approach is however not effective against contamination from $b \to q(c \to q\ell^+\nu)(\bar c \to q\ell'^-\nu)$ and $b \to (c \to q\ell^+\nu)\ell'^-\nu$ transitions, which we will address in Sec. II E. For the moment we assume that each of the two different lepton tags originates from a separate $B$-meson decay chain.
We will focus on $R^{\Upsilon(4S)}_{\tau_{\rm had}\mu}$ for the time being, but a very similar analysis can be performed for $R^{\Upsilon(4S)}_{\tau_{\rm had}e}$ by swapping muons and electrons in the discussion. The single hadronic tau can be produced either directly in the quark-level transition $b \to q\tau\nu$ or in a secondary charm decay, $b \to q'(c \to q\tau\nu)$. On the other hand, a single muon (or equivalently electron) can originate from a direct $b \to q\mu\nu$ transition, from a secondary charm decay $b \to q'(c \to q\mu\nu)$, or from a leptonic tau decay $b \to q(\tau \to \mu\nu\nu)\nu$. Inclusive semileptonic $b$-hadron decays (i.e. $b \to q\ell\nu$) are well under theoretical control and thus the associated rates can be well predicted, including possible effects of LFU violation [21]. The same cannot necessarily be said for inclusive semileptonic charm decays [26], which thus represent a challenging background. One could imagine that the charge of the lepton could help to identify its origin, either from a $b$-quark or from a $c$-quark.
However, one should take into account that in approximately half of the cases the $\Upsilon(4S)$ decays into neutral $B$ mesons, which can oscillate and spoil the correspondence between the charge of the initial quark and that of the lepton. We discuss strategies to mitigate this effect next.

C. Mixing effects
We first define the amplitudes embedding the complete meson decay chains (for instance, a chain may contain $B \to D\pi$ followed by $D \to \ell X$), so that the lepton is not necessarily produced by the decay of the $b$-quark. However, it is not produced by the decay of the light quark in the $B$ meson, which means that in the isospin limit the amplitudes for charged and neutral $B$ mesons are equal. In our notation, the presence or absence of the bar indicates the charge of the $b$-quark inside the $B$ meson and the subscript denotes the charge and flavour of the lepton.
If we look for $\Upsilon(4S) \to \ell_1\ell_2 X$ (with $\ell_1$ and $\ell_2$ being different, either by flavour or charge) through an intermediate $B^0\bar B^0$ state, we can use the description introduced for the study of CP violation from the production of an entangled $B$-meson pair (Sec. 1.2.3 in Ref. [27]). This leads to the time-dependent rate for one of the two $B$ mesons to decay into a state containing $\ell_1$ at a time $t_1$ and the other one into a state containing $\ell_2$ at a time $t_2$. In this rate, $C$ is a normalisation coming from angular integration, $\Delta m$ is the mass difference between the two mass eigenstates, $\Gamma$ is their average width, and the approximations $|q/p| = 1$ and $\Delta\Gamma = 0$ have been used.
To limit the effects of mixing, one could reject events with large time differences $|t_1 - t_2|$, so that there has not been enough time for the oscillation to take place. Cutting at $|t_1 - t_2| \leq \alpha/\Delta m$ (where $\alpha$ is the cut parameter) leads to a rate that depends on $\alpha$ and on $x = \Delta m/\Gamma \simeq 0.769$ [1]. In the case of $B^+B^-$, where no mixing is involved, the corresponding rate has no oscillating component. Denoting the result without a cut on the time difference as $R_{B\bar B} \equiv R^{\infty}_{B\bar B}$, the rate can be written as the sum of two terms, where the first term corresponds to the rate without mixing (in the isospin limit) and the second term to the contamination due to mixing. The $\alpha$ dependence of these terms is illustrated in Fig. 1. With the cut $|t_1 - t_2| \leq \alpha/\Delta m$ one obtains Eq. (21), whose first term goes like $O(\alpha)$ whereas the second term goes like $O(\alpha^3)$. Moreover, the coefficient of the second term goes like $O(x^2)$ and is isospin suppressed.
The second term in Eq. (21), corresponding to the mixing effects, is thus significantly suppressed. From Fig. 1, we see that $\alpha = 0.53$ would ensure that the second contribution is $O(1\%)$ of $R_{B^+B^-}$, taking into account the suppressions by the $\alpha$-dependent factor, by $x^2/(1+x^2)$ and by $1-\rho$ (but not the isospin suppression, which would further suppress this term). On the other hand, the first contribution in Eq. (21) would be half of $R_{B^+B^-}$ (essentially $R$ without the effect of mixing in the isospin limit). More generally, a fit to $R^{\alpha}$ as a function of $\alpha$ would allow one to put a bound on $(R_{B^0\bar B^0} - R_{B^+B^-})$ and to extract $R_{B^+B^-}$ directly.
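The interplay between the cut parameter $\alpha$ and the mixing contamination can be explored numerically. The following Python sketch does not reproduce Eq. (21) itself. It assumes the standard time dependence of a coherent $B^0\bar B^0$ pair with $|q/p| = 1$ and $\Delta\Gamma = 0$, i.e. rates proportional to $e^{-\Gamma|\Delta t|}[1 \pm \cos(\Delta m\,\Delta t)]$ for the unmixed and mixed configurations, and it ignores both the factor $1-\rho$ and the isospin suppression. The function name `window_fractions` and its normalisation are illustrative choices.

```python
import math

X_D = 0.769  # x = Delta m / Gamma, value quoted in the text


def window_fractions(alpha, x=X_D):
    """Fractions of B0 B0bar events with |t1 - t2| <= alpha / Delta m.

    Assumption: rates proportional to exp(-Gamma |dt|) * (1 +/- cos(Delta m dt)),
    normalised so that unmixed + mixed -> 1 when no cut is applied.
    Returns (retained, unmixed, mixed), with retained = unmixed + mixed.
    """
    damp = math.exp(-alpha / x)
    # Integral of Gamma * exp(-Gamma t) over 0 <= t <= alpha / Delta m
    flat = 1.0 - damp
    # Integral of Gamma * exp(-Gamma t) * cos(Delta m t) over the same window
    osc = (1.0 + damp * (x * math.sin(alpha) - math.cos(alpha))) / (1.0 + x * x)
    unmixed = 0.5 * (flat + osc)
    mixed = 0.5 * (flat - osc)
    return flat, unmixed, mixed


if __name__ == "__main__":
    for alpha in (0.25, 0.53, 1.0, 2.0):
        retained, unmixed, mixed = window_fractions(alpha)
        print(f"alpha={alpha:4.2f}  retained={retained:.3f}  "
              f"mixed share={mixed / retained:.4f}")
```

Under these assumptions, $\alpha = 0.53$ retains about half of a non-oscillating rate, the mixed share of the retained events is about 2% before any further suppression, and the mixed component scales like $\alpha^3$ at small $\alpha$.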

D. Charm pollution
As shown in the previous section, cutting on the time difference of the two decaying $B$ mesons can suppress mixing effects and allow one to distinguish leptons originating from $B$ and charm decays by their charge. However, we still need to quantify the expected initial amount of charm contamination. Under the assumption that each tagged lepton originates from a separate $B$ decay chain (which we will relax in the next section), the ratio $R^{\Upsilon(4S)}_{\tau_{\rm had}\mu}$ can be conveniently expressed in terms of inclusive $B$ branching fractions into leptons produced either directly or through secondary $h_c$ and $\tau$ decays, where $h_c$ denotes any weakly decaying charmed hadron, i.e. $D^+$, $D^0$, $D_s$, $\Lambda_c$ and their charge conjugates. As discussed above, using charge ID, but also possibly a cut on leptons not originating from the secondary vertex (i.e. from $B$ decays), it should be possible to suppress contributions where the leptons originate from secondary charm or, in the case of muons, tau decays, by efficiency factors $\epsilon^{(i)} \ll 1$. This allows us to simplify the above expression and write it in terms of the inverse of the inclusive ratio $R(X)_{\tau\mu}$. We obtain Eq. (24), where $\mu_b$ denotes muons consistent with originating from the secondary (i.e. $b$-decay) vertex and we have already used the fact that $\mathcal{B}(B \to X\tau\nu) \gg \mathcal{B}(B \to X(h_c \to X'\tau\nu))$, which we verify below. We can estimate the size of all three corrections on the right-hand side of Eq. (24) (up to the $\epsilon^{(i)}$ efficiencies) based almost purely on experimental information. Starting with the $\epsilon^{(1)}$ term, with $\mathcal{B}(\tau \to \mu\nu\nu) = (17.39 \pm 0.04)\%$ [22] we see that even without cuts (for $\epsilon^{(1)} \simeq 1$) it leads to a (computable) systematic effect of order 4% in $R(X)_{\tau\mu}$.
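The order-4% figure can be cross-checked with a one-line estimate. The Python sketch below assumes that the relative size of the $\epsilon^{(1)}$ correction is simply $\epsilon^{(1)}\,\mathcal{B}(\tau \to \mu\nu\nu)\,R(X)_{\tau\mu}$, i.e. the rate of muons from $b \to X\tau\nu$ followed by $\tau \to \mu\nu\nu$ relative to direct $b \to X\mu\nu$ muons, evaluated with the SM value of $R(X)_{\tau\ell}$ quoted earlier. The function name `tau_feed_down` is hypothetical.

```python
BR_TAU_TO_MU = 0.1739  # B(tau -> mu nu nu), central value quoted in the text
R_X_TAU_SM = 0.223     # SM value of R(X)_{tau l} quoted in the text


def tau_feed_down(eps1=1.0, br_tau_to_mu=BR_TAU_TO_MU, r_x_tau=R_X_TAU_SM):
    """Muons from b -> X tau nu, tau -> mu nu nu, relative to direct b -> X mu nu.

    eps1 is the efficiency for such secondary muons to survive the cuts
    (eps1 = 1 corresponds to no cuts). Assumed form, not taken from Eq. (24).
    """
    return eps1 * br_tau_to_mu * r_x_tau


if __name__ == "__main__":
    for eps1 in (1.0, 0.5, 0.1):
        print(f"eps1={eps1:.1f}  relative feed-down={tau_feed_down(eps1):.4f}")
```

With no cuts the estimate is about 0.039, in line with the order-4% effect quoted above, and it scales linearly with the surviving efficiency.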
We estimate the second and third terms thanks to the identity $\mathcal{B}(B \to X(h_c \to X'\ell\nu)) = \mathcal{B}(B \to X_c) \sum_i f(c \to h_c^{(i)})\, \mathcal{B}(h_c^{(i)} \to X'\ell\nu)$, where the sum runs over all weakly decaying charmed hadrons and $f(c \to h_c^{(i)})$ are the corresponding fragmentation functions. We use Ref. [28] for the charm-inclusive decay branching ratio $\mathcal{B}(B \to X_c) = (97 \pm 4)\%$ and Ref. [29] for the charm fragmentation functions. Note that the above estimate relies on factorization of the inclusive $B$-decay amplitudes and is thus subject to related theoretical uncertainties. In addition, the application of charm fragmentation functions extracted from high-energy $e^+e^-$ and $ep$ collision data to $B$ decays carries further systematic errors. Consequently, our background evaluations should be taken as order-of-magnitude estimates, which are however sufficient for our purpose. For $\mathcal{B}(D_s \to X\mu\nu)$ we use the value measured by CLEO for the electron in the final state [30], $\mathcal{B}(D_s \to Xe\nu) = (6.52 \pm 0.39 \pm 0.15)\%$, which can serve as an effective upper bound on $\mathcal{B}(D_s \to X\mu\nu)$ assuming $e$-$\mu$ LFU in charm decays. We also use $\mathcal{B}(D^+ \to Xe\nu) = 0.1607 \pm 0.0030$ and $\mathcal{B}(D^0 \to Xe\nu) = 0.0649 \pm 0.0011$ [22]. Finally, we obtain numerical estimates of the charm-induced branching fractions. These values are to be compared with the LEP experimental determination of $\mathcal{B}(b \to q\tau\nu) \simeq \mathcal{B}(B \to X\tau\nu) = (2.41 \pm 0.23)\%$ [21]. In particular, before cuts and without lepton charge ID (for $\epsilon^{(2)} \simeq 1$) the second term in Eq. (24) would represent a dominant 80% systematic effect in the determination of $R(X)_{\tau\mu}$. Finally, the effect of the $\epsilon^{(3)}$ term before cuts (for $\epsilon^{(3)} \simeq 1$) represents a relative 28% systematic effect on the determination of $R(X)_{\tau\mu}$.
In summary, the term with $\epsilon^{(1)}$ is small thanks to the low value of $\mathcal{B}(\tau \to \mu\nu\nu)$, whereas the factors multiplying $\epsilon^{(2)}$ and $\epsilon^{(3)}$ are large but are related to charm pollution, which (hopefully) can be reduced thanks to charge ID, leading to small efficiencies $\epsilon^{(2,3)}$.

E. Leptons emitted from the same B-meson
Lastly, we need to consider backgrounds where the two leptons are of different charge and flavour but originate from the same $B$-decay chain, corresponding to the parton-level chains $b \to (c \to q\ell^+\nu)\ell'^-\nu$ and $b \to q c\bar c$ followed by $(c \to q\ell\nu)(\bar c \to q\ell'\nu)$. Denoting these processes collectively as $B \to X\ell\ell'$, and assuming they can be suppressed by cutting on leptons not originating from the secondary vertex (i.e. from $b$ decays), we can again write the relative correction to Eq. (24) due to these contributions, expanded to leading order in all $\epsilon^{(i)}$. In this correction, the inclusive hadronic $B$-decay branching ratio is denoted as $\mathcal{B}(B \to X) \simeq 1 - \sum_\ell \mathcal{B}(B \to X\ell\nu) \simeq 0.76$, we take $\mathcal{B}(B \to X_c e\nu) \simeq \mathcal{B}(B \to X_c\mu\nu) \simeq 0.11$ [22], and the ellipsis denotes the remaining corrections on the right-hand side of Eq. (24).
Using the numerical values given above, we estimate the contribution of the relevant $b \to (c \to q\ell^+\nu)\ell'^-\nu$ transitions. For the decay chain $b \to q c\bar c$ followed by $(c \to q\ell\nu)(\bar c \to q\ell'\nu)$, we use $\mathcal{B}(B \to X_{c\bar c}) \simeq 22\%$ [31] and include the $c \to q\ell\nu$ and $\bar c \to q\ell'\nu$ transition rates. Putting these estimates together, we observe that these backgrounds are individually comparable in size to the signal (i.e. they would represent approximately 80% and 150% relative corrections, respectively) in the absence of cuts to suppress them (for $\epsilon^{(4,5)} \simeq 1$). While they are similar in magnitude, they are highly correlated and contribute with opposite signs, so that they tend to cancel to a degree for $\epsilon^{(4)} \simeq \epsilon^{(5)}$. In fact, the two terms become exactly equal in the limit where one can neglect charm decays to muons and taus (in the ratio $R^{\Upsilon(4S)}_{\tau e}$ these would be charm decays to electrons and taus).
On the other hand, contrary to the corrections outlined in Eq. (24), the corrections considered in this section cannot be suppressed using only lepton charge ID. This highlights the crucial importance of discriminating against leptons originating from the same $B$-decay chain, for instance through geometrical considerations. An alternative strategy could consist in discriminating leptons arising from the secondary ($B$-decay) vertices from those arising further down in the decay chains. A quantitative assessment of the feasibility of either approach would require a dedicated experimental study and is beyond the scope of this work.
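The value $\mathcal{B}(B \to X) \simeq 0.76$ used in this section can be recovered from numbers quoted in the text. The Python sketch below assumes that the subtraction runs over $\ell = e, \mu, \tau$ and uses the $X_c$ values of 0.11 for the light leptons together with the LEP determination for the tau mode. This is our reading of the definition rather than a formula taken from the original equations, and the function name is hypothetical.

```python
BR_B_TO_X_E_NU = 0.11      # B(B -> X_c e nu), value used in the text
BR_B_TO_X_MU_NU = 0.11     # B(B -> X_c mu nu), value used in the text
BR_B_TO_X_TAU_NU = 0.0241  # LEP determination of B(b -> q tau nu) quoted in the text


def hadronic_branching_fraction(
    semileptonic=(BR_B_TO_X_E_NU, BR_B_TO_X_MU_NU, BR_B_TO_X_TAU_NU),
):
    """Inclusive hadronic fraction, assumed to be 1 minus the semileptonic sum."""
    return 1.0 - sum(semileptonic)


if __name__ == "__main__":
    print(f"B(B -> X) ~ {hadronic_branching_fraction():.3f}")
```

With these inputs the sketch gives a hadronic fraction close to 0.76, consistent with the value used above.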

III. CONCLUSIONS
Relying on the specific properties of $B$-factories and in particular the Belle II experiment, we have proposed to compare the inclusive rates of $\Upsilon(4S) \to e^\pm \mu^\mp X$, $\Upsilon(4S) \to \mu^\pm \tau_{\rm had}^\mp X$ and $\Upsilon(4S) \to e^\pm \tau_{\rm had}^\mp X$. This measurement can be related to the ratio $R(X)_{\tau\ell} \equiv \Gamma(b \to X\tau\nu)/\Gamma(b \to X\ell\nu)$ ($\ell = e$ or $\mu$), once appropriate experimental cuts are applied to suppress the effects of neutral $B$ mixing and leptons emitted from rare FCNC (semileptonic) $B$ decays, as well as secondary charmonium, charm and tau decays. The feasibility of our proposal crucially assumes that hadronically decaying tau leptons originating from the $B$ decay vertices can be efficiently disentangled from backgrounds (e.g. from hadronic $B$ decays involving three or more charged pions) at Belle II. A dedicated experimental study of this is however beyond the scope of the present paper.
We have focused on the case of $R^{\Upsilon(4S)}_{\tau\mu} \simeq R(X)_{\tau\mu}$, but our discussion applies equally well to the tau-electron combination, swapping the roles played by electrons and muons. The current deviations in $B \to D^*\ell\nu$ and $B \to D\ell\nu$ when $\tau$ channels are compared to electronic or muonic modes are at the level of 10% (for the LFU ratios of branching ratios) and provide a benchmark for the target sensitivity of our proposal. This is illustrated by the very simple case where NP mimics the $V-A$ structure of $b \to c\tau\nu$ currents in the SM, leading to a universal rescaling of all $b \to c\tau\nu$ branching ratios.
Given our estimates, the systematic uncertainties in the determination of $R(X)_{\tau\ell}$ from a measurement of $R^{\Upsilon(4S)}_{\tau\ell}$ could be brought below a given value ($\epsilon_{\rm sys}$) provided that (1) cuts on the $B$-$\bar B$ impact parameter difference can suppress the neutral $B$-meson mixing effects below $\epsilon_{\rm sys}$, combined with an efficient lepton charge ID to suppress semileptonic charm-decay contamination; and (2a) multiple leptons originating from the same $B$ decay chain can be suppressed to better than $\epsilon_{\rm sys}$, or alternatively (2b) leptons arising from the secondary ($B$-decay) vertices can be discriminated from those arising further down in the decay chains to roughly better than $\epsilon_{\rm sys}$. Further dedicated experimental studies are needed to establish the precision actually attainable at Belle II.
In summary, we have proposed a novel method to test the persistent hints of violation of LFU observed in semileptonic $B$ decays. This measurement would constitute an additional and potentially competitive probe of LFU violations in $b \to c\ell\nu$ transitions, complementary to exclusive measurements and accessible in the Belle II environment.