Tests of the Standard Model in $B \to D\ell \nu_\ell$, $B \to D^* \ell \nu_\ell$ and $B_c \to J/\psi \, \ell \nu_\ell$

A number of recent experimental measurements suggest the possibility of a breakdown of lepton ($\ell$) universality in exclusive $b \to c \ell \nu_\ell$ semileptonic meson decays. We analyze the full differential decay rates for several such processes, and show how to extract combinations of the underlying helicity amplitudes that are completely independent of $m_\ell$. Ratios of these combinations for different $\ell$ (as well as some combinations for a single value of $\ell$) therefore equal unity in the Standard Model and provide stringent tests of lepton universality. Furthermore, the extractions assume the form of weighted integrals over the differential decay rates and therefore are useful even in situations where data in some regions of allowed phase space may be sparse.


I. INTRODUCTION
The Standard Model (SM) has historically worked extremely well, but many compelling reasons lead one to expect the existence of beyond-Standard Model (BSM) physics. Besides gravity, neutrino oscillation is the only confirmed BSM physics, and certainly provides significant information. But it is important to seek out additional regimes in which the SM fails, both for its own discovery potential and to test our understanding of processes that have traditionally been well understood in the SM.
Of course, this tension could be due to statistical fluctuations and/or some subtle systematic experimental bias. If, however, these results are early signals of BSM physics, then a natural explanation could be a breakdown of lepton universality, i.e., some process by which the $\tau$ and $\nu_\tau$ couple to the decaying $B$ or $B_c$ meson differently than do a $\mu$ and $\nu_\mu$. Accordingly, it is useful to construct more experimental tests of lepton universality, beyond just $R(H)$.
* cohen@physics.umd.edu † hlamm@umd.edu ‡ richard.lebed@asu.edu
The value of such tests lies in their utility to isolate where the apparent violation of the SM arises.
In principle, obtaining more sensitive tests is straightforward. $B$-meson decays depend upon the 4-momentum and spin state of the $\ell$ and the decay products of the final hadrons. The process is thus characterized by a differential decay rate expressed in terms of many variables (angles, momentum transfers, etc.). In the absence of BSM physics, the entire differential decay rate is predicted by the SM. If these predictions are known with sufficient precision, a direct comparison to the $\tau$ and $\mu$ rates from experimental data serves as a test of the SM, allowing one to see precisely where the SM breaks down.
There are, however, two major practical difficulties in implementing such a scheme. The first is the requirement of a full prediction from the SM. While to good approximation one can ignore higher-order electroweak effects in semileptonic decays, a SM prediction requires knowledge of several transition form factors of the $B$ ($B_c$) to the $D^{(*)}$ ($J/\psi$). These form factors involve strong interactions, preventing perturbative calculations, but they are amenable to lattice QCD. At present, only the $B \to D$ form factors have been computed with a complete treatment of uncertainties [11,12]. Partial results exist for $B \to D^*$ [19][20][21][22][23] and $B_c^+ \to J/\psi$ [16], but they do not cover the entire allowed range of momentum transfer or have control of their systematics. Even with these limited results, combined constraints on $R(H)$ can be made by application of dispersive relations and heavy quark symmetries [17,24].
While ignorance of the form factors yields a degree of uncertainty in the prediction of $R(H)$, the estimates of these uncertainties have relatively mild consequences for this ratio, provided the form-factor determinations can be trusted. The same cannot be said of the differential decay rates, with all of their parametric dependences.
With sufficient data, one might hope to extract the form factors directly and then check for self-consistency with the SM. For example, one could extract the form factors from the µ channel and then use these to predict the differential decay rate for the τ channel. A comparison of the predicted differential decay rate with the experimental one would then probe the SM. However, this approach is difficult because it requires a considerable amount of reliable data to implement. To be successful, one would need to extract the differential decay rate above experimental background with reasonable accuracy over all allowed ranges of all kinematic variables.
In this paper we propose a number of tests of the SM that are particularly sensitive to lepton universality violations in $b \to c$ semileptonic meson decays. These tests directly probe lepton universality, while having the virtue of being form-factor independent. Moreover, it is likely that some of the proposed tests can be implemented with relatively sparse data. The basic method is to consider the ratio of the $\tau$ to $\mu$ channels of particular weighted integrals of the differential decay rates. These ratios equal unity in the SM (up to subleading electroweak corrections), and their deviation from unity constitutes a measure of the violation of lepton universality. The robustness of these tests lies in the choice of weight functions: Although the hadronic form factors may be unknown, their momentum transfer ($q^2$) dependence is identical for the $\tau$ and $\mu$ channels.
The tests probe universality for the following basic reason: In the SM these decays are dominated by the decay of the $B$ ($B_c$) meson into a $D^{(*)}$ ($J/\psi$) via the emission of a virtual $W$, which subsequently decays into the charged lepton $\ell$ and neutrino $\nu_\ell$. The processes in which the final lepton is a $\tau$ or $\mu$ are distinguished only by the kinematics associated with the different $m_\ell$. However, these kinematical differences lead to different weightings of the various form factors, even at the same value of $q^2$. If, instead, one takes special kinematically weighted averages over the differential decay rates, then lepton universality of the SM requires that these averages be equal.
In addition to testing for violations of lepton universality, we construct other SM tests that do not require knowledge of the form factors. These tests are ratios of weighted integrals of the differential decay rate, but can be performed using a single type of lepton $\ell$.
This work is by no means the first attempt to overcome the difficulties of extracting useful information from the full differential decay rates. Prior works [25][26][27][28][29][30][31] with different aims (e.g., to study the effect of form-factor parameterizations, generalized BSM studies, and effects of the polarization of the $D^*$) have tackled similar problems. In particular, helicity amplitudes (which are particular linear combinations of form factors) are employed in many of these works, as well as in the present paper. Moreover, the "trigonometric moments" of Ref. [26] are closely related but not identical to the weight integrals used here.
This paper is organized as follows: Sec. II describes a set of possible experimental tests of lepton universality and other aspects of the SM for $B \to D \ell \nu_\ell$ and $B \to D^* \ell \nu_\ell$. The derivation of these tests depends upon the connection of the differential decay rate to the helicity amplitudes, which are described in detail in Sec. III. Section IV contains the tests for violations of lepton universality in $B_c \to J/\psi\, \ell \nu_\ell$ and their derivation in terms of helicity amplitudes. Section V contains closing remarks.

II. STANDARD MODEL TESTS IN $B \to D^{(*)} \ell \nu_\ell$
Consider first the semileptonic decay process $P \to V \ell \nu_\ell$, where $P$ is a pseudoscalar meson decaying to a vector meson $V$, which subsequently decays into a pseudoscalar meson pair $P_1 P_2$ (e.g., $B \to D^* \ell \nu_\ell$, $D^* \to D\pi$). The differential rate for such decays depends upon the momentum transfer $q^2$ to the $\ell \nu_\ell$ pair and three angles: $\theta_V$, the polar angle characterizing the direction of $P_1$ (measured in the $V$ rest frame) with respect to the direction of $V$ (measured in the $P$ rest frame); $\theta_\ell$, the polar angle characterizing the direction of the lepton $\ell$ (measured in the $W^*$ [virtual $W$] rest frame) with respect to the direction of $W^*$ (measured in the $P$ rest frame); and $\chi$, the azimuthal angle between the $V P_1 P_2$ plane and the $W^* \ell \nu_\ell$ plane. The angles are shown in Fig. 1, and agree with those defined in Ref. [32]. A detailed description of how these angles compare with other conventions in the literature appears in the following section.
FIG. 1. Angle conventions for semileptonic decays of the form $P \to V \ell \nu_\ell$, $V \to P_1 P_2$, where $P$ is a pseudoscalar meson, $V$ is a vector meson, and $P_1$, $P_2$ ($\tilde\ell^-$, $\tilde\ell^+$) are decay products of $V$. In the first relevant case described in the text, the decay chain is $B \to D^* \ell \nu_\ell$, $D^* \to D\pi$. In the case $B_c \to J/\psi\, \ell \nu_\ell$, the labels $V \to \tilde\ell^- \tilde\ell^+$ represent $J/\psi \to \mu^- \mu^+$.
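One defines the full 4-fold differential decay rate for this process:

$$\frac{d\Gamma(P \to V \ell \nu_\ell,\ V \to P_1 P_2)}{dq^2\, d\cos\theta_V\, d\cos\theta_\ell\, d\chi} .$$

We frequently integrate over the three distinct angles, and therefore introduce the collective symbol

$$X_V \equiv \{\cos\theta_V,\ \cos\theta_\ell,\ \chi\} ,$$

and define the integral measure over $X_V$ and the full derivative with respect to $X_V$ as

$$\int dX_V \equiv \int_{-1}^{1} d\cos\theta_V \int_{-1}^{1} d\cos\theta_\ell \int_{0}^{2\pi} d\chi , \qquad \frac{d}{dX_V} \equiv \frac{d^3}{d\cos\theta_V\, d\cos\theta_\ell\, d\chi} ,$$

respectively. Thus, the full differential cross section can be denoted by $d\Gamma^V_\ell / dq^2\, dX_V$, and the total cross section by

$$\Gamma^V_\ell \equiv \int_{m_\ell^2}^{(M_P - M_V)^2} dq^2 \int dX_V\, \frac{d\Gamma^V_\ell}{dq^2\, dX_V} , \tag{3}$$

where $q^2$ is integrated over all kinematically allowed momentum transfers, from the hadronic maximum recoil point $q^2 = m_\ell^2$ (at which the $\ell$ is produced at rest in the $W^*$ rest frame) to the hadronic zero-recoil point $q^2 = (M_P - M_V)^2$ (at which the $V$ is produced at rest in the $P$ rest frame).

Alternatively, consider a process in which the final-state hadron is a weakly decaying pseudoscalar $P'$ (e.g., $B \to D \ell \nu_\ell$). The kinematics is simpler because the $P'$ is a (pseudo)scalar without strong decay modes. The kinematical variables are similar to those above (upon substituting $V \to P'$), but only the angle $\theta_\ell$ remains, and the full differential decay rate is given by

$$\frac{d\Gamma(P \to P' \ell \nu_\ell)}{dq^2\, d\cos\theta_\ell} .$$

For later compactness, let us define $X_{P'} \equiv \{\cos\theta_\ell\}$. The total cross section is then

$$\Gamma^{P'}_\ell \equiv \int_{m_\ell^2}^{(M_P - M_{P'})^2} dq^2 \int dX_{P'}\, \frac{d\Gamma^{P'}_\ell}{dq^2\, dX_{P'}} . \tag{4}$$

One can trivially generalize Eqs. (3)-(4) to weighted cross sections $\Gamma^H_{\ell,i}$, where $H = V, P'$, by integrating with a weight function $W_i(q^2, m_\ell^2, X_H)$:

$$\Gamma^H_{\ell,i} \equiv \int_{m_\tau^2}^{(M_P - M_H)^2} dq^2 \int dX_H\, W_i(q^2, m_\ell^2, X_H)\, \frac{d\Gamma^H_\ell}{dq^2\, dX_H} . \tag{5}$$

Note that the $q^2$ bounds include only the allowable kinematic regime for $\tau$ decays, independent of the lepton channel considered. By excluding the range $m_\mu^2 \le q^2 < m_\tau^2$, one ensures that the same range of phase space is sampled in all channels.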
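The common $q^2$ window shared by the two lepton channels can be made concrete with a short numerical sketch. The hadron and lepton masses below are approximate, charge-averaged values supplied here as assumptions (they are not quoted in the text), and the function names are purely illustrative.

```python
# Approximate masses in GeV (assumed inputs, not taken from the text).
M_MU = 0.1056584
M_TAU = 1.77686
HADRON_MASS = {"B": 5.2795, "D": 1.867, "D*": 2.009, "Bc": 6.2745, "J/psi": 3.0969}


def q2_range(parent, daughter, m_lepton):
    """Full kinematic range m_l^2 <= q^2 <= (M_P - M_H)^2, in GeV^2."""
    q2_min = m_lepton ** 2
    q2_max = (HADRON_MASS[parent] - HADRON_MASS[daughter]) ** 2
    return q2_min, q2_max


def common_q2_range(parent, daughter):
    """Range shared by the mu and tau channels once q^2 >= m_tau^2 is imposed."""
    return q2_range(parent, daughter, M_TAU)


def excluded_fraction(parent, daughter):
    """Fraction of the muon q^2 interval (by length, not by event count)
    that is removed by the cut q^2 >= m_tau^2."""
    mu_min, q2_max = q2_range(parent, daughter, M_MU)
    tau_min, _ = q2_range(parent, daughter, M_TAU)
    return (tau_min - mu_min) / (q2_max - mu_min)


for parent, daughter in [("B", "D"), ("B", "D*"), ("Bc", "J/psi")]:
    lo, hi = common_q2_range(parent, daughter)
    frac = excluded_fraction(parent, daughter)
    print(f"{parent} -> {daughter}: {lo:.3f} <= q^2 <= {hi:.3f} GeV^2; "
          f"muon q^2 interval excluded: {frac:.1%}")
```

The excluded fraction is a statement about the length of the $q^2$ interval only; how many muon events actually fall below $m_\tau^2$ depends on the differential rate, which this sketch does not model.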
With these definitions, one can construct ratios from different combinations of $\ell$ and $W_i$. The simplest of these, $R^H_i$, are generalizations of the standard $R(H)$:

$$R^H_i \equiv \frac{\Gamma^H_{\tau,i}}{\Gamma^H_{\mu,i}} . \tag{6}$$

Note that the restriction $q^2 \ge m_\tau^2$ means the $R^H_i$ with $W_i = 1$ are not the ratios $R(H)$ typically used in the literature, which are instead defined as ratios of the full decay widths to these lepton channels.
One has considerable freedom in choosing $W_i$, but not all choices are useful. For our purpose of removing form-factor and leptonic-mass dependences, we initially restrict to forms in which $q^2$ and $m_\ell^2$ only appear in the ratio

$$\varepsilon \equiv \frac{m_\ell^2}{q^2} , \tag{7}$$

which always obeys $\varepsilon \le 1$ in the allowed range for $q^2$.
While $\varepsilon$ strictly depends upon $m_\ell$, we forgo an index on $\varepsilon$ unless confusion would arise. For decays $P \to P'$ (e.g., $B \to D \ell \nu_\ell$), one finds three weight functions, $W_a$, $W_b$, and $W_c$ [Eq. (8)], that remove the form-factor dependences (their derivation appears below, in Sec. III, and they can be recognized in Table II). Similarly, for decays to $V$ (e.g., $B \to D^* \ell \nu_\ell$, $D^* \to D\pi$), we construct form-factor independent SM tests by choosing $W_i(m_\ell^2, q^2, X_V) = W_i(\varepsilon, X_V)$ to be any of the eight forms $W_1, \ldots, W_8$ of Eq. (9) (cf. Table I). With these choices of $W_i$, by construction the SM predicts that the ratios defined in Eq. (6) satisfy

$$R^H_i = 1 + O(\alpha) , \tag{10}$$

where $O(\alpha)$ indicates leading-order electroweak corrections not included in our analysis, the same level currently neglected in $R(H)$ calculations. The prediction of Eq. (10) for each $i$ can be viewed as a test of lepton universality: Universality violations imply that $R^H_i$ generically differs from unity.

At this stage, the angular and $\varepsilon$ factors appearing in Eqs. (8)-(9) seem quite arbitrary, and it may seem unclear how they remove the form-factor dependence or should yield $R^H_i = 1$ in the SM. In fact, the reason for both is quite simple. In Sec. III, the differential cross sections are written in terms of helicity amplitudes (which are linear combinations of the transition form factors). It is shown below that, when any $W_i$ given above is integrated over the differential cross sections, one obtains a particular quadratic form of the helicity amplitudes; the example of Eq. (11) involves $H_+$ and $H_-$, two helicity amplitudes defined in Sec. III, and $G_0$, a combination of overall fundamental constants and known functions of $q^2$ (but not $m_\ell^2$). Furthermore, the $W_i$ are designed to remove the kinematic dependences on $\varepsilon$ such that, for fixed $q^2$, the weighted differential cross section after angular integration depends upon a fixed combination of helicity amplitudes, independent of lepton flavor. Therefore, the $\Gamma^V_{\ell,i}$ are integrals only of these special combinations, so that, e.g., Eq. (11) yields a ratio [Eq. (12)] that is manifestly unity in the SM, regardless of whether one can determine the helicity amplitudes.

While one could compare different lepton channels at the weighted differential cross-section level, such an analysis may be difficult because the data are sparse in some bins, or because the experimental analysis may not be straightforward for extracting them. Instead, by integrating in $q^2$, one can perform these calculations on any data set that can produce $R(H)$, with improved sampling statistics and reduced background for realistic experimental situations.
One should note that while $W_a$, $W_c$, $W_1$, $W_2$, $W_7$, and $W_8$ depend upon helicity-amplitude combinations appearing in the total decay rates [see Eqs. (26) and (32)], $W_b$ and $W_{3-6}$ do not. Therefore, to explain the existing $R(H)$ tensions with BSM physics, the former weights are particularly important for the immediate analysis. But tests based upon $W_b$ and $W_{3-6}$ are interesting in their own right, as they probe other aspects of possible SM violations. These tests can also be applied to $\bar B$ ($b \to c\, \ell^- \bar\nu_\ell$) decays, using precisely the same $W_i$ except for an overall sign change in $W_3$; but this sign is innocuous in $R^H_i$.

One is not restricted just to the weight functions $W_{a,b,c}$ and $W_{1-8}$ discussed above. Clearly, any (possibly $q^2$-dependent) linear combinations of $W_{a,b,c}$ or $W_{1-8}$ also yield valid weight functions $W$ for which the SM predictions of Eq. (10) hold:

$$W \equiv \sum_j f_j(q^2)\, W_j , \tag{13}$$

where $j$ runs over the set of allowed weight functions for the $H$ decay channel, either $a, b, c$ for $H = P'$, or $1$-$8$ for $H = V$, and $f_j(q^2)$ are functions of $q^2$ that are independent of lepton flavor. One would be mistaken to presume these linear combinations provide no new information. First, the functions $f$ can be chosen to emphasize different $q^2$ regions, as opposed to using an unweighted $q^2$ integral. When using experimental results, it may be advantageous to choose $f$ to reduce the experimental uncertainties in the ratios by choosing linear combinations of weight functions or their coefficients in Eq. (13) that minimize the contribution from kinematical regions with larger uncertainties, e.g., close to the $q^2$ minimum value of $m_\tau^2$. Second, even for $f$ constant, the ratio of averages using Eq. (13) would include terms containing ratios of the form $W_j / W_k$ where $j \neq k$, which are absent from ratios containing a single weight function. In short, the ratio of sums differs from the sum of ratios.
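A minimal sketch of how a combined weight of the form in Eq. (13) might be assembled is given below. The two weights are placeholders invented for illustration and are not the $W_i$ of the text; the damping coefficient $f(q^2) = (1 - m_\tau^2/q^2)^n$ is likewise only one hypothetical choice that vanishes at the lower $q^2$ limit while remaining identical in both lepton channels.

```python
M_TAU = 1.77686  # GeV, approximate (assumed input)


def epsilon(q2, m2):
    """Lepton mass parameter m_l^2 / q^2."""
    return m2 / q2


# Placeholder weights for illustration only; they are NOT the W_i of the text.
def toy_weight_1(q2, m2, cos_theta_l):
    return 1.0 / (1.0 - epsilon(q2, m2)) ** 2


def toy_weight_2(q2, m2, cos_theta_l):
    return cos_theta_l ** 2 / (1.0 - epsilon(q2, m2)) ** 2


def threshold_damping(power):
    """f(q^2) = (1 - m_tau^2/q^2)^power, which vanishes as q^2 -> m_tau^2.
    It depends on q^2 only, so it is the same in the mu and tau channels."""
    def f(q2):
        return (1.0 - M_TAU ** 2 / q2) ** power
    return f


def combine(weights, coefficients):
    """Build W(q2, m2, x) = sum_j f_j(q2) * W_j(q2, m2, x)."""
    def combined(q2, m2, x):
        return sum(f(q2) * w(q2, m2, x) for f, w in zip(coefficients, weights))
    return combined


W = combine([toy_weight_1, toy_weight_2],
            [threshold_damping(3), threshold_damping(3)])
```

Because each coefficient multiplies a weight whose ratio is already unity in the SM at fixed $q^2$, any such combination inherits the same prediction; the choice of `power` only controls how strongly the threshold region is suppressed.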
It is straightforward to test these relations experimentally. Consider an idealized experimental situation: One has an arbitrarily large amount of data in a complete set of $N^H$ decay events, of which $N^H_\ell$ are semileptonic decay events in the $\ell = \mu, \tau$ channels; the momentum transfer and the angles are measured to arbitrary accuracy; and for each such event $j$ with precisely determined kinematics, one can determine two probabilities to arbitrary accuracy: the probability $P^b_j$ that an event with kinematics $j$, which has been identified as a possible $P \to H$ decay, is actually a background event (rather than being a true decay, which has probability $\bar P^b_j = 1 - P^b_j$), and the probability $P^d_j$ that an event with kinematics $j$ is measured and correctly identified (i.e., the total efficiency for detection and identification is known).
In such a case, the statistical average of the ratios $R^H_i$ can be determined experimentally by the ratio of weighted event sums given in Eq. (14), where the brackets indicate a statistical average for the quantity, and the index $j$ ($j'$) indicates a particular decay event in the $\ell = \tau$ ($\ell = \mu$) channel. $\Theta$ denotes a Heaviside step function that ensures the sums cover the same kinematic region in $q^2$. Equation (14) represents a pure counting experiment: Since the events in both the numerator and denominator are sampled probabilistically, they effectively map out the $\tau$ and $\mu$ differential decay-width distributions; by weighting each event with the appropriate function $W_i$, one develops an approximation to the relations of Eq. (10).
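The counting procedure can be sketched as follows. Since Eq. (14) itself is not reproduced here, the per-event factor $W_i\,\bar P^b_j / P^d_j$ used below is an assumed form inferred from the surrounding description, and the `Event` container and function names are hypothetical. The sketch also assumes that both samples derive from the same parent sample, so that any common normalization cancels in the ratio.

```python
from dataclasses import dataclass


@dataclass
class Event:
    q2: float            # momentum transfer squared, GeV^2
    angles: tuple        # angular variables X_H for this event
    p_background: float  # P^b_j, probability the event is background
    p_detect: float      # P^d_j, probability of detection and identification


def weighted_sum(events, weight, m_lepton, q2_cut):
    """Sum of W_i * (1 - P^b_j) / P^d_j over events with q^2 >= q2_cut."""
    total = 0.0
    for ev in events:
        if ev.q2 < q2_cut:  # Heaviside step: keep only q^2 >= m_tau^2
            continue
        p_signal = 1.0 - ev.p_background
        total += weight(ev.q2, m_lepton ** 2, ev.angles) * p_signal / ev.p_detect
    return total


def ratio_estimate(tau_events, mu_events, weight, m_tau=1.77686, m_mu=0.1056584):
    """Estimate R^H_i as the ratio of weighted sums in the tau and mu channels."""
    q2_cut = m_tau ** 2
    numerator = weighted_sum(tau_events, weight, m_tau, q2_cut)
    denominator = weighted_sum(mu_events, weight, m_mu, q2_cut)
    return numerator / denominator
```

A call such as `ratio_estimate(tau_events, mu_events, my_weight)` takes any callable `my_weight(q2, m2, angles)`; with a unit weight it reduces to a background- and efficiency-corrected ratio of event counts above the $\tau$ threshold.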
A few comments about the experimental implementation of Eq. (14) are in order. First, one can in principle obtain reliable estimates of $R^H_i$ (for at least some choices of the weight functions $W_i$) with far less data than is needed to extract the form factors. In particular, one does not need the full angular dependence of the data at identical values of $q^2$ to obtain well-converged sums in Eq. (14). In this sense, the situation is similar to the extraction of $R(H)$ in Refs. [2][3][4][5][6][7][8][9]18].
Second, while theoretically the $R^H_i$ do not depend upon knowledge of the form factors, the experimental extractions of the ratios can depend upon the form factors, to the extent that they are used in the determination of $P^b_{j,j'}$ and $P^d_{j,j'}$ (which is a potential major concern, as the experimental uncertainty on $R(J/\psi)$ is dominated by form-factor uncertainties used to discriminate backgrounds [9]).
Third, throughout our analysis we assume that the $\tau$ can be fully reconstructed. In practice, such detailed information might not be accessible, in which case one could either generalize the technique presented here by including the angular dependences from the $\tau$ decay products, or restrict to a set of $W_i$ that can be reliably extracted. The latter approach is considered in Ref. [33], where the authors study the restricted set of useful observables when only limited information can be extracted from the final states of $\tau$ decays.
Fourth, in principle an infinite number of $R^H_i$ exist, due to the arbitrary linear combinations and coefficient $q^2$ dependences allowed by Eq. (13). One thus obtains an infinite number of tests of the SM. One can exploit this freedom in two complementary ways. First, if one believes that the discrepancies are hints of a particular BSM model, one can choose $W_i$ to maximize sensitivity to those particular violations. Alternatively, one may exploit the freedom in choosing $W_i$ to reduce the experimental uncertainties by choosing linear combinations in Eq. (13) that minimize the contribution from kinematical regions with larger uncertainties, e.g., the limit $\varepsilon \to 1$ ($q^2 \to m_\ell^2$), where fewer events should occur and which is therefore very sensitive to statistical fluctuations.
In this context, it is worth noting that all $W_i$ have a coefficient as $\varepsilon \to 1$ ($q^2 \to m_\ell^2$) at least as singular as $(1-\varepsilon)^{-2}$, which compensates for a factor of $(1-\varepsilon)^2$ in the total cross section arising from phase space and helicity suppression constraints. In Eq. (6), these factors cancel and yield finite results. However, in an experimental situation, the data in this region can become particularly sensitive to statistical fluctuations since there should be fewer events in the $\tau$ channel.$^2$ To remove this sensitivity, one may exploit the freedom in choosing the functions $f$ in Eq. (13) to ensure that they go to zero as $q^2 \to m_\tau^2$, and thereby suppress large fluctuations. This freedom is particularly important for $W_a$, $W_2$, $W_4$, and $W_5$, which scale as $(1-\varepsilon)^{-3}$.
$^2$ Owing to the cutoff $q^2 \ge m_\tau^2$ in Eq. (6), the factor $(1-\varepsilon)^{-2}$ in the $\mu$ channel is always within 1% of unity.
Similarly, $W_b$, $W_c$, $W_6$, $W_7$, and $W_8$ contain overall factors of $1/\varepsilon$. For the $\mu$ channel, this factor is always quite large, at least 280. These factors arise in helicity-suppressed helicity amplitudes in the differential cross section. It will therefore likely be difficult to extract these amplitudes accurately, since statistical or systematic errors can swamp the data. Thus, the most robust tests of the SM avoid reliance on these $W_i$. However, BSM models could enhance these amplitudes such that deviations from the SM predictions might be large enough to tease out using linear combinations containing these weight functions.
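The two numerical statements about the $\mu$ channel (a factor $1/\varepsilon$ of at least about 280, and $(1-\varepsilon)^{-2}$ within 1% of unity) follow directly from $\varepsilon = m_\ell^2/q^2$ and the cut $q^2 \ge m_\tau^2$. The short check below assumes approximate lepton masses, which are inputs supplied here rather than values quoted in the text.

```python
# Approximate lepton masses in GeV (assumed inputs).
M_MU = 0.1056584
M_TAU = 1.77686


def epsilon(m_lepton, q2):
    """Lepton mass parameter epsilon = m_l^2 / q^2."""
    return m_lepton ** 2 / q2


# With the cut q^2 >= m_tau^2, the muon epsilon is largest at q^2 = m_tau^2.
q2_min = M_TAU ** 2
eps_mu_max = epsilon(M_MU, q2_min)

inverse_eps_min = 1.0 / eps_mu_max              # smallest value of 1/epsilon for muons
phase_space_factor = (1.0 - eps_mu_max) ** -2   # largest value of (1 - epsilon)^-2

print(f"max epsilon (mu)     = {eps_mu_max:.5f}")
print(f"min 1/epsilon (mu)   = {inverse_eps_min:.1f}")
print(f"max (1-epsilon)^-2   = {phase_space_factor:.4f}")
```

In the $\tau$ channel, by contrast, `epsilon(M_TAU, q2_min)` equals 1 at the lower limit, which is why the same factors that are harmless for muons become singular there.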
We identify another class of SM tests for $P \to V$ that is not sensitive to violations of lepton universality, but rather probes other aspects of the SM while remaining independent of the form factors. This class of test also depends upon ratios of two weight functions, but only a single lepton flavor. These tests reflect the nature of the weight functions $W_1$ and $W_8$, which have two distinct angular dependences, and yet yield the same helicity amplitude combinations as in Eq. (11). The corresponding ratio $R^V_{\ell,nd}$ is given in Eq. (15), with the weight functions $W_{n,d}$ defined in Eq. (16), where $h(q^2)$ and $\phi_i(q^2)$ are specified functions of $q^2$. The SM prediction is again $R^V_{\ell,nd} = 1 + O(\alpha)$ for both $\ell = \mu, \tau$, and any choice of $h(q^2)$, $\phi_n(q^2)$, and $\phi_d(q^2)$.
Since this test depends upon $W_8$, which has a coefficient $1/\varepsilon$ that is large over much of the kinematic region, a useful test will likely select functions $\phi(q^2)$ that deemphasize the region where $\varepsilon$ is especially small. Note that, since the ratios $R^V_{\ell,nd}$ refer to a single species of lepton $\ell$, the integrations in both the numerator and denominator extend to $\varepsilon = 1$, unlike $R^H_i$, which is restricted to $q^2 \ge m_\tau^2$.
Having shown how to construct tests of the SM from the weight functions $W_i$, in the next section we demonstrate how these $W_i$ arise naturally in association with the helicity amplitudes appearing in the decay rates.

A. The Decays $P \to V \ell \nu_\ell$, $V \to P_1 P_2$
The form factors for the transition of a pseudoscalar meson $P$ (mass $M$, momentum $p$) to a vector meson $V$ (mass $m$, momentum $p'$, polarization vector $\epsilon$) are defined as in Ref. [34], with the momentum transfer given by $q^2 \equiv (p - p')^2$. The first calculations of the complete differential decay rates of the semileptonic process $P \to V \ell \nu$, $V \to P_1 P_2$ including finite charged-lepton mass effects appeared in Refs. [35,36]. The helicity amplitudes defined in the classic review Ref. [32] and still commonly used (e.g., by the Belle Collaboration [37]) are expressed in terms of these form factors. Here, $p_V$ is the momentum magnitude of the $V$ (or virtual $W$) in the center-of-momentum (c.m.) frame of $P$ [Eq. (19)]. The subscript on $H$ gives the $W^*$ helicity: $\pm 1$ and $0$ for $J_{W^*} = 1$, $t$ (timelike) for $J_{W^*} = 0$. The superscript KS indicates the notation of Ref. [35],$^3$ and the combinations $F_{1,2}$ are those defined in Ref. [34]. The precise number of independent helicity amplitudes for semileptonic processes is most easily computed by considering the crossed process with all hadrons in the initial state and all leptons in the final state, and then imposing assumed conservation laws (e.g., CP conservation) on the system [38,39].

The full 4-fold differential cross section for the semileptonic decay is given in Eq. (20), where $q^2$ is the momentum transfer (or equivalently, the invariant squared mass of the $W^*$), and $\eta = \pm 1$ corresponds to processes with lepton pairs $\ell^- \bar\nu_\ell$ and $\ell^+ \nu_\ell$, respectively (i.e., twice the neutrino helicity). This expression is equivalent to Eq. (22) in [35] if one replaces $\theta^{\rm KS} = \pi - \theta_\ell$. In a conventional calculation, the angular factors emerge from choosing a helicity basis of polarization vectors $\epsilon_V$ for $V$ and $\epsilon_W$ for $W^*$, and the lepton 4-momenta $p_\ell$ and $p_\nu$. More generally, they are Wigner rotation matrices connecting various helicity states; adapting from Ref. [40], one may write the rate in the form of Eq. (21).

Unlike in Ref. [40], the $V$ spin in this expression is fixed to 1; and the $W^*$ spin $J$ is no longer limited just to 1, but is also allowed to assume the ($J = 0$) timelike polarization $\epsilon_W^\mu = q^\mu / \sqrt{q^2}$. When $q^\mu = p_\ell^\mu + p_\nu^\mu$ is contracted with the lepton bilinear, e.g., $\bar u(p_\ell) \gamma_\mu v_L(p_\nu)$ or $\bar v_R(p_\nu) \gamma_\mu u(p_\ell)$ in the case $\eta = +1$, use of the Dirac equation produces an overall coefficient of $m_\ell / \sqrt{q^2}$ in the amplitude. The total lepton helicity $\kappa$ in the $W^*$ rest frame is given by $\kappa = \lambda_\ell + \eta/2$ and equals $\eta$ for the spin non-flip transition (right-handed $\bar\nu_\ell$ and left-handed $\ell^-$ for $\eta = +1$, left-handed $\nu_\ell$ and right-handed $\ell^+$ for $\eta = -1$) and $0$ for the spin-flip transition (opposite helicities for $\ell$). The spin non-flip transition gives the leading-order amplitude in the $V\!-\!A$ theory, which in the $W^*$ rest frame gives a contribution to the rate proportional to $2 p_\ell (E_\ell + p_\ell) = q^2 - m_\ell^2$, while the spin-flip contribution is proportional to $2 p_\ell (E_\ell - p_\ell) = (q^2 - m_\ell^2)(m_\ell^2 / q^2)$. The lepton mass parameter $\varepsilon$ thus appears in four places in the differential rate: (i) in the quasi-two-body phase space factor $p_\ell \propto q^2 - m_\ell^2$ in $W^* \to \ell \nu$; (ii) in the factor $p_\ell$ common to both spin non-flip and spin-flip transitions in $V\!-\!A$ theory; (iii) in the additional suppression of spin-flip transitions in the $V\!-\!A$ theory; and (iv) in the coupling of a timelike $W^*$ in any vectorlike theory. A pedagogical review of these points appears in Ref. [41].
$^3$ Although Ref. [32] does not define $H_t$, it is natural to extrapolate from Ref. [35], using the same relative sign as for $H_{\pm,0}$.
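The two kinematic identities quoted for the spin non-flip and spin-flip contributions can be verified numerically. The sketch below assumes a massless neutrino and two-body kinematics in the $W^*$ rest frame; the function names are illustrative only.

```python
import math


def lepton_energy_momentum(q2, m_lepton):
    """Charged-lepton energy and momentum in the W* rest frame,
    for W* -> l nu with a massless neutrino (two-body kinematics)."""
    sqrt_q2 = math.sqrt(q2)
    momentum = (q2 - m_lepton ** 2) / (2.0 * sqrt_q2)
    energy = (q2 + m_lepton ** 2) / (2.0 * sqrt_q2)
    return energy, momentum


def non_flip_factor(q2, m_lepton):
    """2 p (E + p), expected to equal q^2 - m^2."""
    energy, momentum = lepton_energy_momentum(q2, m_lepton)
    return 2.0 * momentum * (energy + momentum)


def flip_factor(q2, m_lepton):
    """2 p (E - p), expected to equal (q^2 - m^2) * m^2 / q^2."""
    energy, momentum = lepton_energy_momentum(q2, m_lepton)
    return 2.0 * momentum * (energy - momentum)


q2, m_tau = 6.0, 1.77686  # GeV^2 and GeV; illustrative values
eps = m_tau ** 2 / q2
print(non_flip_factor(q2, m_tau), q2 - m_tau ** 2)
print(flip_factor(q2, m_tau) / non_flip_factor(q2, m_tau), eps)
```

The ratio of the two factors is exactly $\varepsilon = m_\ell^2/q^2$, which is the extra suppression of spin-flip transitions referred to in item (iii).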
The amplitudes $H^J_{\lambda,\kappa}$ in Eq. (21) incorporate the nonperturbative physics in terms of helicity amplitudes (and ultimately, form factors), while the Wigner rotation matrices $D^J_{m',m}(\alpha, \beta, \gamma) = e^{-i m' \alpha}\, d^J_{m',m}(\beta)\, e^{-i m \gamma}$ encapsulate all the nontrivial angular correlations. Only one azimuthal angle $\chi$ is required to describe the decay, which is that of the $D^* \to D\pi$ decay plane with respect to the $W^* \to \ell \nu$ decay plane (Fig. 1). The factor $(-1)^J$ represents the sign difference in the norm between timelike and spacelike $W^*$ polarizations. The sums are further restricted by the factor $d^J_{\lambda,\kappa}$ when $J = 0$ to have $\lambda = \kappa = 0$. Lastly, note the great simplification due to the decay of the spin-1 $V$ to spinless particles $P_{1,2}$: Only the matrices $d^1_{\lambda,0}$ are needed to describe the angular dependence for that subprocess.
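For orientation, the spin-1 small-$d$ matrix can be written out explicitly. The sketch below uses one common sign convention (rows and columns ordered $m = +1, 0, -1$); the convention adopted in the references cited in the text may differ by signs, so this is an illustration of the structure rather than of the authors' exact phases.

```python
import math


def wigner_d1(beta):
    """Spin-1 small-d matrix d^1_{m',m}(beta) in a common convention,
    with rows (m') and columns (m) ordered +1, 0, -1."""
    c, s = math.cos(beta), math.sin(beta)
    r = s / math.sqrt(2.0)
    return [
        [(1 + c) / 2, -r, (1 - c) / 2],
        [r, c, -r],
        [(1 - c) / 2, r, (1 + c) / 2],
    ]


def vector_decay_factors(theta_v):
    """The three elements d^1_{lambda,0}(theta_V) for lambda = +1, 0, -1,
    i.e. the only ones needed when V decays to two spinless particles."""
    return [row[1] for row in wigner_d1(theta_v)]


def is_orthogonal(matrix, tol=1e-12):
    """Check that the rows of a real square matrix are orthonormal."""
    n = len(matrix)
    for a in range(n):
        for b in range(n):
            dot = sum(matrix[a][k] * matrix[b][k] for k in range(n))
            if abs(dot - (1.0 if a == b else 0.0)) > tol:
                return False
    return True


print(vector_decay_factors(math.pi / 3))
print(is_orthogonal(wigner_d1(0.7)))
```

The squares of the three $d^1_{\lambda,0}$ elements are $\tfrac12 \sin^2\theta_V$, $\cos^2\theta_V$, and $\tfrac12 \sin^2\theta_V$, which is the origin of the $\sin^2\theta_V$ and $\cos^2\theta_V$ structures discussed in the text.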
The precise definitions of the angles are depicted in Fig. 1 and agree with those in Ref. [32]: Starting with the rest frame of the spinless $P$, the $V$-$W^*$ decay axis is identified with the $z$-axis, i.e., $\hat p_V = +\hat z$. Then the helicity $\lambda \equiv \lambda_V = \lambda_{W^*}$. Boosting into the $W^*$ rest frame, one finds the $\ell$ and $\nu$ back-to-back, and defines $\theta_\ell$ as the polar angle of $\ell$ with respect to the $W^*$ direction as measured in the $P$ rest frame. Similarly, boosting into the $V$ rest frame, one finds $P_1$ and $P_2$ back-to-back, and defines $\theta_V$ as the polar angle of $P_1$ (which we take as the heavier of $P_{1,2}$, such as $D$ in $D^* \to D\pi$) with respect to the $V$ direction as measured in the $P$ rest frame. Finally, we take $\chi$ as the azimuthal angle of the $V P_1 P_2$ plane with respect to the $W^* \ell \nu$ plane; to be precise, Refs. [32,37] actually exhibit $\chi$ as the clockwise rotation of the $V P_1 P_2$ plane with respect to the $W^* \ell \nu$ plane, as viewed with respect to the axis $\hat p_V = +\hat z$, which explains the relative sign of the phase in Eq. (21) compared to that in the conventional notation given above.$^4$

Once the amplitudes $H^1_{\lambda,|\kappa|=1} = H_\lambda$, $H^1_{\lambda,0} = \sqrt{\varepsilon/2}\, H_\lambda$, and $H^0_{0,0} = \sqrt{3\varepsilon/2}\, H_t$ are inserted and all CP-violating terms (those proportional to the imaginary parts of interference terms, $\mathrm{Im}\, H_i H_j^*$, and hence proportional to $\sin\chi$) are neglected, one obtains Eq. (20). Retaining CP violation modifies Eq. (20) in such a way that, for each term of the form $\cos(n\chi)\, \mathrm{Re}\, H_i H_j^*$, where $n = 1$ or $2$ and $i \neq j$, one introduces an additional term of the form $\pm \sin(n\chi)\, \mathrm{Im}\, H_i H_j^*$, in which the sign depends upon the particular amplitudes $H_{i,j}$. Such effects appear in the analysis of Ref. [40] and are relevant to studies such as in Ref. [42].

The question now becomes whether one can extract independently the helicity amplitude combination $\mathrm{Re}\, H_i H_j^*$ from each term in Eq. (20), and indeed, since most of the $\varepsilon$-suppressed terms also carry distinct angular dependence, the combinations $\varepsilon\, \mathrm{Re}\, H_i H_j^*$ as well. Of the 15 such terms in Eq. (20), some are clearly linearly dependent: For example, there is no way to extract the difference between $\varepsilon |H_+|^2$ and $\varepsilon |H_-|^2$, nor $\mathrm{Re}\, H_+ H_-^*$ independently of $\varepsilon\, \mathrm{Re}\, H_+ H_-^*$. This linear dependence arises partly through the restrictive form of the $V\!-\!A$ interaction and partly through the simplicity of the helicity structures appearing in $V \to P_1 P_2$. As for the remaining terms, one might think to use the orthonormality of $D$ matrices, first reducing pairs of the matrices via the Clebsch-Gordan series

$$D^{j_1}_{m_1' m_1}\, D^{j_2}_{m_2' m_2} = \sum_{J} \langle j_1 m_1'\, j_2 m_2' | J M' \rangle\, \langle j_1 m_1\, j_2 m_2 | J M \rangle\, D^{J}_{M' M} , \qquad M' = m_1' + m_2' , \quad M = m_1 + m_2 . \tag{22}$$

While this method identifies the linearly dependent terms, a much simpler approach is available for Eq. (20): By inspection, one first separates terms with $\chi$ dependence into the sets $1$, $\cos\chi$, and $\cos 2\chi$, which are clearly independent by Fourier analysis. Of these, the $\cos 2\chi$ term in Eq. (20) is unique, while the only independent structures multiplying $\cos\chi$ are clearly $\sin\theta_\ell \sin 2\theta_V$ and $\sin 2\theta_\ell \sin 2\theta_V$. Of the $\chi$-independent terms, the independent $\theta_\ell$ structures are $\cos\theta_\ell$, $\cos^2\theta_\ell$, and $\sin^2\theta_\ell$. The corresponding independent $\theta_V$ structures can always be reduced to the set $\cos^2\theta_V$ and $\sin^2\theta_V$, so that Eq. (20) contains 6 linearly independent $\chi$-independent terms. In total, exactly 9 structures in Eq. (20) are independent. One can further extract the coefficient of each angular structure using orthogonality almost by inspection: For example, a term proportional to $\sin\theta_\ell \sin 2\theta_V \cos\chi$ is most easily separated from all other structures present simply by integrating with the weight function of Eq. (23).

Defining an overall differential width coefficient [Eq. (24)], which is $64\pi/9$ times the coefficient in the first line of Eq. (20), one extracts helicity amplitude combinations by performing the integrals of Eq. (25); the required weight functions $w_0(\theta_\ell, \theta_V, \chi)$ and the 9 independent simple combinations of helicity amplitudes that can be extracted are listed in Table I. The full differential width $d\Gamma/dq^2$ is of course obtained simply by setting $w_0 = 1$ [Eq. (26)].

The results of this analysis identify several interesting features: First, the squared amplitudes $|H_\pm|^2$ are the only ones that can be extracted independently of the lepton mass correction $\varepsilon$; indeed, $H_t$ is always accompanied by a factor $\varepsilon$, and its mixing with $H_0$ prevents an $\varepsilon$-independent determination of $|H_0|^2$. Perhaps most interesting from the point of view of lepton universality studies is that the ratio of the eighth line of Table I to the first, whose integrals differ only in the $\theta_\ell$ weighting, gives a unique determination of the lepton mass parameter $\varepsilon$. To be explicit, one first performs the weighted integral over $\theta_V$ and $\chi$; the result is not the same as $d\Gamma / dq^2\, d\cos\theta_\ell$, due to the presence of the extra $\theta_V$-dependent term. One then obtains $\varepsilon$ from this result. The same relations have been used to a rather different effect in Eqs. (15)-(16).
$^4$ Strictly speaking, this $\chi$ differs from the one ($\chi^{\rm KS}$) used in Ref. [35] by $\chi = -\chi^{\rm KS}$. Furthermore, a reanalysis of $\chi^{\rm Dey}$ used in Ref. [40] shows that $\chi = \pi + \chi^{\rm Dey}$: To obtain Eq. (20), the factor $e^{i\lambda\chi}$ in Eq. (21) must be replaced with $e^{i\lambda(\pi+\chi)}$.
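The orthogonality argument can be checked numerically for a few of the angular structures named above. In this sketch the projecting weight is taken to be the structure $\sin\theta_\ell \sin 2\theta_V \cos\chi$ itself, which is an assumption made for illustration; the weight function the authors use in Eq. (23) may differ. The form chosen for the $\cos 2\chi$ structure is likewise an assumed representative.

```python
import math


def structures(cos_l, cos_v, chi):
    """A few angular structures of the kind discussed in the text,
    evaluated at one point of (cos theta_l, cos theta_V, chi)."""
    sin_l = math.sqrt(1.0 - cos_l ** 2)
    sin_v = math.sqrt(1.0 - cos_v ** 2)
    sin2_l = 2.0 * sin_l * cos_l
    sin2_v = 2.0 * sin_v * cos_v
    return {
        "sin_l sin2_v cos_chi": sin_l * sin2_v * math.cos(chi),
        "sin2_l sin2_v cos_chi": sin2_l * sin2_v * math.cos(chi),
        "sin^2_l sin^2_v cos_2chi": sin_l ** 2 * sin_v ** 2 * math.cos(2.0 * chi),
        "sin^2_l cos^2_v": sin_l ** 2 * cos_v ** 2,
    }


def overlap(name_a, name_b, n=24):
    """Midpoint-rule estimate of the integral of (a * b) over
    dcos(theta_l) dcos(theta_V) dchi on [-1,1] x [-1,1] x [0, 2 pi]."""
    h_cos, h_chi = 2.0 / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        cos_l = -1.0 + (i + 0.5) * h_cos
        for j in range(n):
            cos_v = -1.0 + (j + 0.5) * h_cos
            for k in range(n):
                chi = (k + 0.5) * h_chi
                values = structures(cos_l, cos_v, chi)
                total += values[name_a] * values[name_b]
    return total * h_cos * h_cos * h_chi


target = "sin_l sin2_v cos_chi"
for name in structures(0.0, 0.0, 0.0):
    print(f"<{target} | {name}> = {overlap(target, name):+.4f}")
```

Only the diagonal overlap is nonzero (analytically $64\pi/45$); the others vanish either by the $\chi$ integration or by the parity of the integrand in $\cos\theta_\ell$, which is the sense in which the coefficient can be isolated "almost by inspection".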

B. The Decays $P \to P' \ell \nu_\ell$
The much simpler class of decays $P \to P' \ell \nu_\ell$, where $P'$ like $P$ is also a pseudoscalar meson, is presented here, following the more complicated class $P \to V \ell \nu_\ell$, $V \to P_1 P_2$, because the relevant partial-wave expressions can be deduced almost immediately from the previous case. One notes that since the $P'$ is spinless, the $W^*$ can couple only through its helicity-0 states: the $J = 1$ component that couples to the helicity amplitude $H_0$, and the $J = 0$ component that couples to the helicity amplitude $H_t$. To be specific, the form factors for the transition of a pseudoscalar meson $P$ (mass $M$, momentum $p$) to a pseudoscalar meson $P'$ (mass $m$, momentum $p'$) are defined as in Ref. [34]. The helicity amplitudes are then given as in Ref. [35], where the combination $f_0$ is defined in Ref. [34]. Note particularly that the same names $H_0$, $H_t$ are used here for the helicity amplitudes of $P \to P' \ell \nu_\ell$ as for $P \to V \ell \nu_\ell$, $V \to P_1 P_2$, even though they refer to distinct hadronic quantities in the two cases. The label $V$ in the momentum $p_V$ defined in Eq. (19) now refers to $P'$ in this subsection.

The full differential rate for $P \to P' \ell \nu_\ell$ depends only upon two variables, namely, $q^2$ and $\theta_\ell$, where $\theta_\ell$ is defined precisely as in Fig. 1. One may obtain the differential rate simply by taking the expression in Eq. (20) and setting $H_+ = 0$, $H_- = 0$, $B(V \to P_1 P_2) = 1$, and integrating over the full ranges of $d\cos\theta_V$ and $d\chi$. Clearly, being able to use the same names $H_0$, $H_t$ for both $P \to P' \ell \nu_\ell$ and $P \to V \ell \nu_\ell$, $V \to P_1 P_2$ in the reduction of Eq. (20) means that the helicity amplitudes must have the correct relative normalization. One may also integrate over the full range of $\theta_\ell$ to obtain the total differential width of Eq. (32).

TABLE I (caption fragment). Weight functions $w_0(\theta_\ell, \theta_V, \chi)$ integrated against the full 4-fold differential width Eq. (20) for processes $P \to V \ell \nu_\ell$, $V \to P_1 P_2$ in the manner described in Eq. (25). They apply to cases where $V$ decays to a state of total spin-projection zero along the decay axis.
The particular weight functions $w_0(\theta_\ell)$ analogous to those in Table I are defined as ones that extract simple helicity amplitude combinations when performing integrals analogous to those in Eq. (25). The required weight functions $w_0(\theta_\ell)$ and the 3 independent simple combinations of helicity amplitudes that can be extracted are listed in Table II. One notes that these combinations are precisely the subset of those in Table I depending only upon $H_0$ and $H_t$ (although, again, they refer here to $P \to P'$ and not $P \to V$ transitions).
The corresponding results for $P \to V \ell \nu$, $V \to \tilde\ell^- \tilde\ell^+$ can be obtained in an analogous way. Gone is the simplification of the previous case, in which the spinless $P_1$ and $P_2$ both have zero helicity. However, in the physically relevant case of $B_c \to J/\psi\, \ell \nu$, $J/\psi \to \tilde\ell^- \tilde\ell^+$, the $J/\psi$ is too light to decay to $\tau^+ \tau^-$, while for $\tilde\ell = \mu$ (the experimentally favored channel for reconstruction of a $J/\psi$), one has $(m_\mu / m_{J/\psi})^2 = 1.16 \times 10^{-3}$: The outgoing $\mu$ pair are almost pure helicity eigenstates, a restriction that reduces the angular analysis to be almost as straightforward as in the previous section. We thus ignore $m_\mu$ in the decay of $J/\psi$ but retain $m_\ell$ from the semileptonic decay.
The expansion of Eq. [...] Table I apply equally well for the two $\sigma = 0$ cases. Note the identification of $P_1 \to \tilde\ell^-$, as in Fig. 1, for the purpose of defining scattering angles.
The occurrence of $R^H_i \neq 1$ for some ratio $i$ does not necessarily imply lepton-universality violation, but it does require BSM physics of some form that acts differently for different final-state leptons. If one attributes the current tension in the measured ratios $R(H)$ to BSM physics, our tests provide a deeper level of information. Either at least one of the $R^H_i$ must differ from unity, thereby suggesting the structure of the BSM physics based upon which helicity combination exhibits this signal; or else no non-unity $R^H_i$ is found, in which case the BSM physics must reside in the $q^2 \le m_\tau^2$ muon data (i.e., the nonuniversal portion of the lepton phase space). In that scenario, other muonic tests, such as Eq. (15) (a single-lepton flavor test that uses the entire phase space) or $(g-2)_\mu$, can provide constraints.