N = 5, 6, 7, 8: Nested hypothesis tests and truncation dependence of $|V_{cb}|$

The determination of $|V_{cb}|$ from exclusive semileptonic $B\to D^*\ell\nu$ decays is sensitive to the choice of form factor parametrization. Larger $|V_{cb}|$ values are obtained when fitting the BGL rather than the CLN parametrization to recent Belle measurements. For the BGL parametrization, published fits use different numbers of parameters. We propose a method based on nested hypothesis tests to determine the optimal number of BGL parameters to fit the data, and find that six parameters are optimal to fit the Belle tagged and unfolded measurement. We further explore the differences between fits that use different numbers of parameters. The fits that yield $|V_{cb}|$ values in better agreement with determinations from inclusive semileptonic decays tend to exhibit tensions with heavy quark symmetry expectations. These have to be resolved before the determinations of $|V_{cb}|$ from exclusive and inclusive decays can be considered understood.


I. INTRODUCTION
In 2017, the Belle Collaboration presented, for the first time, unfolded measurements of the differential decay distributions for $\bar B\to D^*\ell\bar\nu$ decays [1], and another measurement appeared more recently [2]. The unfolded measurement [1] permitted outside groups to perform their own fits to the data, using different parametrizations of the $\bar B\to D^*\ell\bar\nu$ form factors to extract $|V_{cb}|$. The choice of form factor parametrization can have a sizable impact on the extracted value of $|V_{cb}|$. This is because heavy quark symmetry gives the strongest constraints on the differential rate at zero recoil (maximal dilepton invariant mass, $q^2$) [3][4][5][6][7][8][9][10], resulting in both continuum methods and lattice QCD giving the most precise information on the normalization of the rate at zero recoil. However, phase space vanishes near maximal $q^2$ as $\sqrt{q^2_{\rm max} - q^2}$, so the measured $q^2$ spectrum has to be fitted over some range to extract $|V_{cb}|$. This results in sensitivity to the functional form of the fitted parametrization.
Fitting Belle's unfolded measurement [1] to the BGL parametrization [11,12] yielded higher values of $|V_{cb}|$ [13,14] than fitting the CLN [15] parametrization to the same dataset. (To our knowledge, during 1997-2017, all BABAR and Belle measurements of $|V_{cb}|$ from $\bar B\to D^*\ell\bar\nu$ used the CLN parametrization.) The BGL results are in better agreement with $|V_{cb}|$ extracted from inclusive $B\to X_c\ell\nu$ decays [16]:
$|V_{cb}|_{\rm CLN} = (38.2 \pm 1.5)\times 10^{-3}$, [1], (1a)
$|V_{cb}|_{\mathrm{BGL}_{332}} = (41.7^{+2.0}_{-2.1})\times 10^{-3}$, [13], (1b)
$|V_{cb}|_{\mathrm{BGL}_{222}} = (41.9^{+2.0}_{-1.9})\times 10^{-3}$, [14]. (1c)
Here the $\mathrm{BGL}_{ijk}$ notation highlights that these fits have different numbers of parameters (the notation is defined below in Sec. II), in particular eight and six parameters, respectively.
In Ref. [2], the Belle Collaboration published an "untagged" measurement of $\bar B\to D^*\ell\bar\nu$, without fully reconstructing the second $B$ meson in the collision using hadronic decay modes. In that analysis, fits to the CLN and a five-parameter version of the BGL parametrization were performed [2], and the results [Eqs. (2a) and (2b)] are in agreement.
The BGL method implements constraints on the shapes of the $B\to D^*$ form factors based on analyticity and unitarity [17][18][19]. Three conveniently chosen linear combinations of form factors are expressed in terms of power series in a small conformal parameter, $0 < z \ll 1$. As indicated in Eqs. (1) and (2), there are varying choices for the total number of coefficients, $N$, in the three power series, ranging from $N = 5$ [2] to $N = 6$ [14,20] and $N = 8$ [13,21,22]. The CLN [15] prescription uses similar analyticity and unitarity constraints on the $B\to D$ form factor, heavy quark effective theory (HQET) [7,8] relations between the $B\to D$ and $B\to D^*$ form factors, and QCD sum rule calculations [23][24][25] of the order $\Lambda_{\rm QCD}/m_{c,b}$ subleading Isgur-Wise functions [9,10]. It has four fit parameters. (This version of the CLN parametrization, as used to extract $|V_{cb}|$, is not self-consistent at $\mathcal O(\Lambda_{\rm QCD}/m_{c,b})$ [26].)
The relation between the above fits is nontrivial, and has not been studied systematically. The goal of this paper is to explore their differences, and to devise a quantitative method to identify the optimal number of parameters in the BGL framework. Using a prescription based on a nested hypothesis test, we find that at least six parameters are required to describe the data from Ref. [1]. The $N = 5$ and $6$ fits we study in detail yield $|V_{cb}|$ values in better agreement with determinations from inclusive semileptonic decays, but they exhibit tensions with expectations from heavy quark symmetry.

II. FORMALISM AND NOTATIONS
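The vector and axial-vector $\bar B\to D^*$ form factors $h_V$ and $h_{A_{1,2,3}}$ are defined as in Eq. (3), where $v$ ($v'$) is the four-velocity of the $B$ ($D^*$). In the heavy quark limit these form factors are related to the Isgur-Wise function $\xi(w)$ [3,4]. Each of these form factors can be expanded in powers of $\Lambda_{\rm QCD}/m_{c,b}$ and $\alpha_s$.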
In the massless lepton limit (i.e., $\ell = e$ or $\mu$), the differential $B\to D^*\ell\nu$ rate is given by Eq. (4), where $r = m_{D^*}/m_B$, and $\mathcal F(w)$ can be written in terms of $h_{A_1}(w)$ and the two form factor ratios $R_{1,2}(w)$ as in Eq. (5) (see, e.g., Ref. [27]). All measurable information is then contained in the three functions $\mathcal F(w)$ and $R_{1,2}(w)$. Throughout this paper, $\mathcal F(1) = 0.906$ [28] and $\eta_{\rm ew} = 1.0066$ [29] are used to convert fit results for $|V_{cb}|\,\mathcal F(1)\,\eta_{\rm ew}$ to values of $|V_{cb}|$. In the heavy quark limit, $R_{1,2}(w) = 1 + \mathcal O(\Lambda_{\rm QCD}/m_{c,b},\, \alpha_s)$ and $\mathcal F(w) = \xi(w)$. Thus, $R_{1,2}(w) - 1$ parametrize deviations from the heavy quark limit.
The BGL framework is defined by expanding three form factors $g$, $f$, and $\mathcal F_1$, which are linear combinations of those defined in Eq. (3), in power series of the form $1/[P_i(z)\phi_i(z)] \times \sum_n a^i_n z^n$, where $i = g$, $f$, $\mathcal F_1$ (see, e.g., Ref. [12], and note that $\mathcal F_1 \neq \mathcal F$). Here $z = z(w)$ is a conformal parameter that maps the physical region $1 < w < 1.5$ onto $0 < z < 0.056$, and $P_i(z)$ and $\phi_i(z)$ are known functions [14]. There are two notations in the literature for the coefficients of these power series, which map onto each other via
$\{a_n, b_n, c_n\}$ [14] $\longleftrightarrow$ $\{a^g_n, a^f_n, a^{\mathcal F_1}_n\}$ [13]. (6)
In the remainder of this paper, we adopt the former notation, so that $a_n$, $b_n$, and $c_n$ are the coefficients of $g$, $f$, and $\mathcal F_1$, respectively. (The convention for the sign of $g$, and thus the $a_n$, in Ref. [14] is opposite to that used in Refs. [13,22].) Note that $c_0$ is fixed by $b_0$ [12,14], and the fits are performed for the rescaled parameters
$\{\tilde a_n, \tilde b_n, \tilde c_n\} = \eta_{\rm ew} |V_{cb}| \{a_n, b_n, c_n\}$, (7)
and $|V_{cb}|$ is determined by $|\tilde b_0|$.
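The two numerical statements above can be checked with a few lines of Python. This is a minimal sketch: the explicit form of $z(w)$ is not written out in the text, so the code assumes the commonly used choice $z = (\sqrt{w+1}-\sqrt 2)/(\sqrt{w+1}+\sqrt 2)$, which reproduces the quoted range $0 < z < 0.056$. The function names are illustrative only.

```python
import math

F1 = 0.906       # F(1), lattice QCD value quoted in the text
ETA_EW = 1.0066  # electroweak correction quoted in the text


def z_of_w(w):
    """Conformal parameter, ASSUMING z = (sqrt(w+1) - sqrt(2)) / (sqrt(w+1) + sqrt(2))."""
    s = math.sqrt(w + 1.0)
    return (s - math.sqrt(2.0)) / (s + math.sqrt(2.0))


def vcb_from_product(product):
    """Convert a fitted value of |V_cb| F(1) eta_ew into |V_cb|."""
    return product / (F1 * ETA_EW)


# End points of the physical region 1 < w < 1.5
print(z_of_w(1.0), round(z_of_w(1.5), 3))
```

With this assumed form, $z(1) = 0$ and $z(1.5) \approx 0.056$, so even the third-order terms $z^2$ are suppressed by about $3\times10^{-3}$ relative to the leading ones unless their coefficients are large.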
To study and distinguish expansions truncated at different orders in $z$, we denote by $\mathrm{BGL}_{n_a n_b n_c}$ a BGL fit with the parameters
$\{a_{0,\ldots,n_a-1},\ b_{0,\ldots,n_b-1},\ c_{1,\ldots,n_c}\}$. (8)
The total number of fit parameters is $N = n_a + n_b + n_c$. The BGL parametrization used in Refs. [14,20] is $\mathrm{BGL}_{222}$, while that used in Refs. [13,22] is $\mathrm{BGL}_{332}$.
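As an illustration of the notation in Eq. (8), the following short sketch lists the free parameters of a given $\mathrm{BGL}_{n_a n_b n_c}$ fit. The helper name `bgl_parameters` is hypothetical and only serves to spell out the index ranges.

```python
def bgl_parameters(na, nb, nc):
    """Names of the free parameters of a BGL_{na nb nc} fit, following Eq. (8)."""
    a = [f"a{n}" for n in range(na)]         # a_0 ... a_{na-1}
    b = [f"b{n}" for n in range(nb)]         # b_0 ... b_{nb-1}
    c = [f"c{n}" for n in range(1, nc + 1)]  # c_1 ... c_{nc}; c_0 is fixed by b_0
    return a + b + c


for fit in [(2, 2, 2), (3, 3, 2), (1, 2, 2)]:
    params = bgl_parameters(*fit)
    print(fit, "N =", len(params), params)
```

The $c$ coefficients start at $c_1$ because $c_0$ is not a free parameter, which is why $N = n_a + n_b + n_c$ counts all fitted coefficients.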

III. NESTED HYPOTHESIS TESTS: FIXING THE OPTIMAL NUMBER OF COEFFICIENTS
Our aim is to construct a prescription to determine the optimal number of parameters to fit a given dataset. This can be achieved by use of a nested hypothesis test: a test of an N-parameter fit hypothesis versus a fit using one additional parameter (the alternative hypothesis).
Such a hypothesis test requires an appropriate statistical measure or test statistic. A suitable choice is the difference in $\chi^2$,
$\Delta\chi^2 = \chi^2_N - \chi^2_{N+1}$. (9)
The fit with one additional parameter, i.e., the $(N+1)$-parameter fit, has one fewer degree of freedom (d.o.f.), where the number of d.o.f. is the number of bins minus the number of parameters. In the limit of a large number of d.o.f., $\Delta\chi^2$ is distributed as a $\chi^2$ with a single d.o.f. [30]. One may reject or accept the alternative hypothesis by choosing a decision boundary. If, for instance, we choose $\Delta\chi^2 = 1$ as the decision boundary, we would reject the $(N+1)$-parameter hypothesis in favor of the $N$-parameter fit 68% of the time, if the $N$-parameter hypothesis is true.
We seek a prescription to incrementally apply this nested hypothesis test, starting from a suitably small initial number of parameters (to avoid possible overfitting), until we reach the simplest (smallest-$N$) fit containing the initial parameters that is preferred over all hypotheses that nest it or are nested by it. For a set of BGL fits, we thus propose the following prescription, starting from a suitable low-$N$ fit $\mathrm{BGL}_{n_a n_b n_c}$:
(i) Carry out fits with one parameter added (a "descendant" fit) or, when permitted, removed (a "parent" fit); i.e., for $\mathrm{BGL}_{(n_a\pm1) n_b n_c}$, $\mathrm{BGL}_{n_a (n_b\pm1) n_c}$, $\mathrm{BGL}_{n_a n_b (n_c\pm1)}$.
(ii) For each descendant (parent) hypothesis, accept it over $\mathrm{BGL}_{n_a n_b n_c}$ if $\Delta\chi^2$ is above (below) the decision boundary value.
(iii) Repeat (i) and (ii) recursively, until a "stationary" fit is reached, that is, a fit that is preferred over its parents and descendants.
(iv) If there are multiple stationary fits, choose the one with the smallest $N$, then the smallest $\chi^2$.
The optimal truncation order obtained this way depends on the precision of the available experimental data. Our prescription attempts to minimize the residual model dependence (caused by this truncation) with respect to the experimental uncertainty.
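Steps (i) to (iv) can be sketched in Python as a walk over a table of $\chi^2$ values. This is an illustrative sketch only: the function names are hypothetical, and the $\chi^2$ table is an invented toy (it is not the content of Fig. 1), constructed so that second-order coefficients give large improvements and third-order ones give small improvements. The last lines also check the 68% statement, using the fact that the cumulative distribution of a $\chi^2$ variable with one d.o.f. is $\mathrm{erf}\big(\sqrt{x/2}\big)$.

```python
import math
from itertools import product


def neighbours(fit, step, n_min=1, n_max=3):
    """Fits obtained by changing one of (n_a, n_b, n_c) by `step`
    (+1 gives descendants, -1 gives parents)."""
    out = []
    for i in range(3):
        cand = list(fit)
        cand[i] += step
        if n_min <= cand[i] <= n_max:
            out.append(tuple(cand))
    return out


def select_fit(chi2, start, boundary=1.0):
    """Apply steps (i)-(iv) starting from `start`; `chi2` maps (n_a, n_b, n_c) -> chi^2."""
    stationary, seen, todo = [], set(), [start]
    while todo:
        fit = todo.pop()
        if fit in seen:
            continue
        seen.add(fit)
        # (ii) a descendant is accepted if the chi^2 improvement exceeds the boundary ...
        preferred = [d for d in neighbours(fit, +1) if chi2[fit] - chi2[d] > boundary]
        # ... a parent is accepted if dropping the parameter costs less than the boundary
        preferred += [p for p in neighbours(fit, -1) if chi2[p] - chi2[fit] < boundary]
        if preferred:
            todo.extend(preferred)   # (iii) recurse
        else:
            stationary.append(fit)   # preferred over all parents and descendants
    # (iv) smallest N first, then smallest chi^2
    return min(stationary, key=lambda f: (sum(f), chi2[f]))


def toy_chi2(na, nb, nc):
    """HYPOTHETICAL chi^2 table, invented for illustration (not the values of Fig. 1)."""
    return (27.0 + 6.0 * (na == 1) + 4.0 * (nb == 1) + 2.5 * (nc == 1)
            - 0.3 * ((na == 3) + (nb == 3) + (nc == 3)))


chi2 = {f: toy_chi2(*f) for f in product((1, 2, 3), repeat=3)}
best = select_fit(chi2, start=(1, 1, 1))

# Probability that Delta chi^2 < 1 for a chi^2 distribution with one d.o.f.
coverage = math.erf(math.sqrt(1.0 / 2.0))
print(best, round(coverage, 3))
```

Each accepted step in this sketch lowers the quantity $\chi^2 + N \times$ (decision boundary), so the walk cannot cycle and always ends at a stationary fit.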
Figure 1 shows the fitted $\chi^2$ values for the set of 27 different $\mathrm{BGL}_{n_a n_b n_c}$ fits with $n_i = 1, 2, 3$. A suitable choice for a starting fit is $\mathrm{BGL}_{111}$ or one of the three possible fits with $N = 4$. Using the decision boundary of $\Delta\chi^2 > 1$, one then obtains a single stationary solution, $\mathrm{BGL}_{222}$, shown in bold. For example, one path to $\mathrm{BGL}_{222}$ is 111 → 211 → 221 → 222, while another is 121 → 131 → 231 → 232 → 222.
Also shown in Fig. 1 are the $|V_{cb}|$ values for all 27 fits. These results are consistent with the statement made in Ref. [13] that the extracted values of $|V_{cb}|$ remain stable when one adds more fit parameters to the $\mathrm{BGL}_{332}$ fit. This stability can be seen directly by comparing the preferred $\mathrm{BGL}_{222}$ fit with its descendants. One may notice that the $\chi^2$ of the $\mathrm{BGL}_{333}$ fit is substantially smaller than those of its parents. However, our procedure starting from $N = 3$ or $4$ fits always terminates before reaching so many parameters. Plotting the fitted $\mathrm{BGL}_{333}$ distributions, one sees that its small $\chi^2$ is due to fitting fluctuations in the data, and should be seen as an overfit.
The unitarity constraints, $\sum_{n=0}^{\infty} |a_n|^2 \le 1$ and $\sum_{n=0}^{\infty} (|b_n|^2 + |c_n|^2) \le 1$, can be imposed on the fits. The stationary fit in our approach, $\mathrm{BGL}_{222}$, is far from saturating these bounds [14]. While the form factors must obey the unitarity constraints, statistical fluctuations in their binned measurements may cause the central values to appear to violate unitarity$^1$ (at a modest confidence level). This can occur because such fits may yield large coefficients for higher-order terms to accommodate "wiggles" in the data. In this paper, we do not impose unitarity as a constraint; fits whose central values violate unitarity (at a modest confidence level) may suggest an overfit. This is the case for the $\mathrm{BGL}_{333}$ fit, providing another reason to limit the number of fit coefficients, as proposed in our method.
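A check of the truncated unitarity sums can be sketched as follows. The coefficient values are hypothetical placeholders, not fit results, and the function names are illustrative. The inputs are assumed to be the unrescaled coefficients $a_n$, $b_n$, $c_n$ (i.e., the fitted $\tilde a_n$, $\tilde b_n$, $\tilde c_n$ of Eq. (7) divided by $\eta_{\rm ew}|V_{cb}|$), with $c_0$ included in the list of $c_n$.

```python
def unitarity_sums(a, b, c):
    """Truncated sums: (sum_n |a_n|^2, sum_n (|b_n|^2 + |c_n|^2))."""
    s_a = sum(x * x for x in a)
    s_bc = sum(x * x for x in b) + sum(x * x for x in c)
    return s_a, s_bc


def violates_unitarity(a, b, c):
    """True if either truncated sum of central values exceeds 1."""
    s_a, s_bc = unitarity_sums(a, b, c)
    return s_a > 1.0 or s_bc > 1.0


# HYPOTHETICAL central values, for illustration only
modest = dict(a=[0.03, -0.1], b=[0.012, 0.02], c=[0.002, 0.0005, -0.01])
wiggly = dict(a=[0.03, -0.1, 3.0], b=[0.012, 0.02, -0.8], c=[0.002, 0.0005, -0.01, 0.5])

print(unitarity_sums(**modest), violates_unitarity(**modest))
print(unitarity_sums(**wiggly), violates_unitarity(**wiggly))
```

Since every term in the sums is non-negative, a truncated sum above 1 already implies that the full sum exceeds the bound; this test uses central values only and ignores their uncertainties.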

IV. COMPARING N = 5 FITS WITH $\mathrm{BGL}_{222}$
To explore the differences between the various five-parameter fits and the $\mathrm{BGL}_{222}$ fit, we perform such fits to Belle's unfolded data [1]. (The untagged Belle measurement [2] is not unfolded, and cannot be analyzed at this point outside the Belle framework. With limited statistics, the differences between the fits we perform on the unfolded data contain fluctuations, which are different from those of the folded measurement.) There are six possible fits with five parameters, as shown in Fig. 1. Here we focus on comparing $\mathrm{BGL}_{122}$, $\mathrm{BGL}_{212}$, and $\mathrm{BGL}_{221}$, which set $a_1$, $b_1$, or $c_2$, respectively, to zero. (We do not study further the $\mathrm{BGL}_{311}$, $\mathrm{BGL}_{131}$, and $\mathrm{BGL}_{113}$ fits, as each removes two parameters from and adds one parameter to the $\mathrm{BGL}_{222}$ fit.) The results of the $\mathrm{BGL}_{222}$ fit and the three five-parameter fits for the physical observables $|V_{cb}|$, $R_{1,2}(1)$, and $R'_{1,2}(1)$ are shown in Table I. (Our $\mathrm{BGL}_{222}$ fit results vary slightly from those in Ref. [20], due to using $m_B = 5.280$ GeV versus $5.279$ GeV.) The best-fit parameters [rescaled as in Eq. (7)] and correlations for these four fits are shown in Fig. 2.
FIG. 1. The $\chi^2$ (upper entry) and $|V_{cb}| \times 10^3$ (lower entry) values for the $\mathrm{BGL}_{n_a n_b n_c}$ fits used for the nested hypothesis test. The number of free parameters in a given fit is $N = n_a + n_b + n_c$ and the bold entry is the selected $\mathrm{BGL}_{222}$ hypothesis $\{a_0, a_1, b_0, b_1, c_1, c_2\}$. Cells corresponding to $N = 5, 6, 7, 8$ are highlighted blue, green, orange, and red, respectively.
$^1$ We thank Paolo Gambino for raising this question.
The results for the $\mathrm{BGL}_{222}$ fit in Fig. 2 suggest that, if one wants to reduce the number of fit parameters from six to five, the $\mathrm{BGL}_{122}$ fit might be the least optimal choice, as the significance of a nonzero value for $|a_1|$ is greater than for $|b_1|$, which is in turn greater than for $|c_2|$. This is in line with the observation that, compared to the $\mathrm{BGL}_{222}$ fit, the value of $\chi^2$ increases the most for $\mathrm{BGL}_{122}$, followed by $\mathrm{BGL}_{212}$, and then $\mathrm{BGL}_{221}$. This suggests that among the five-parameter fits, setting $c_2 = 0$ (the $\mathrm{BGL}_{221}$ fit) may instead be the preferred option, though it is inferior, according to our method, to the $\mathrm{BGL}_{222}$ fit for the Belle tagged and unfolded dataset [1].
The top row in Fig. 3 shows $\mathcal F(w)$ normalized to the lattice QCD value of $\mathcal F(1)$, as $|V_{cb}|\,\mathcal F(w)/\mathcal F(1)$, for six fits. The left-side plots show three previously published fits: the $\mathrm{BGL}_{222}$ and CLN fit results, based on the 2017 Belle tagged measurement, and the "BLPR" result of Ref. [26], which performed an HQET-based fit to both $B\to D^*\ell\nu$ and $B\to D\ell\nu$ data to determine the subleading $\mathcal O(\Lambda_{\rm QCD}/m_{c,b})$ Isgur-Wise functions, using also lattice QCD information.
The right-side plots in Fig. 3 show the $\mathrm{BGL}_{122}$, $\mathrm{BGL}_{212}$, and $\mathrm{BGL}_{221}$ fits, based on the 2017 Belle tagged measurement [1]. The shaded bands indicate the uncertainties. The $\mathrm{BGL}_{222}$ and $\mathrm{BGL}_{221}$ fits have the largest differential rates near zero recoil ($w = 1$), corresponding to the largest extracted values of $|V_{cb}|$.
The value of $|V_{cb}|$ extracted from the $\mathrm{BGL}_{122}$ fit to the 2017 Belle unfolded measurement [1] is more than $1\sigma$ smaller than in the six-parameter $\mathrm{BGL}_{222}$ fit to the same data. This raises the question: would a $\mathrm{BGL}_{222}$ fit to the 2018 Belle measurement [2] find a larger value of $|V_{cb}|$ than that in Eq. (2b), closer to its inclusive determination? The consistency of the fitted $\mathrm{BGL}_{122}$ coefficients from the 2017 and 2018 Belle measurements is only at about the $2\sigma$ level for $\tilde a_0$.
Also shown in Fig. 3 are the fit results for the form factor ratios $R_{1,2}(w)$. The $\mathrm{BGL}_{222}$ fit to the tagged Belle measurement [1] indicated a substantial deviation from heavy quark symmetry, in particular for the $R_1$ form factor ratio [20]. The central values, for fixed quark mass parameters, at $\mathcal O(\Lambda_{\rm QCD}/m_{c,b},\, \alpha_s)$, are given in Eq. (10) [20], where $\eta(w)$ is a ratio of a subleading and the leading Isgur-Wise function. With $\eta(1)$ and $\eta'(1)$ of order unity, $R_1(1)$ cannot be much below 1, and $|R'_1(1)|$ cannot be large, without a breakdown of heavy quark symmetry. Preliminary lattice QCD calculations [31,32] also do not indicate $\mathcal O(1)$ violations of heavy quark symmetry. Figure 3 shows that the $\mathrm{BGL}_{122}$ fit exhibits better agreement with heavy quark symmetry expectations for $R_1(w)$. However, this likely arises because $R_1(w) \propto (w+1)\, g/f$, so setting $a_1 = 0$ constrains the shape of the numerator. By contrast, the $\mathrm{BGL}_{212}$, $\mathrm{BGL}_{221}$, and $\mathrm{BGL}_{222}$ fits prefer $a_1 \neq 0$, and yield $R_1(w)$ in some tension with heavy quark symmetry and lattice QCD.

V. TOY STUDIES
To validate the prescription outlined above, and to demonstrate that it yields an unbiased value of $|V_{cb}|$, we carried out a toy MC study using ensembles of pseudodata sets. These were generated using the $\mathrm{BGL}_{333}$ parametrization, i.e., with nine coefficients. The six lower-order coefficients $\{\tilde a_{0,1}, \tilde b_{0,1}, \tilde c_{1,2}\}$ were chosen to be identical to the $\mathrm{BGL}_{222}$ fit results of Fig. 2. The third-order terms $\{\tilde a_2, \tilde b_2, \tilde c_3\}$ were chosen according to two different scenarios: either 1 or 10 times the size of the $\{\tilde a_1, \tilde b_1, \tilde c_2\}$ coefficients in the $\mathrm{BGL}_{222}$ fit, as shown in Table II. We call these the "1×" and "10×" scenarios, respectively. Ensembles were constructed as follows: First, predictions for the 40 bins of the tagged measurement [1] were produced. Ensembles of pseudodata sets were then generated using the full experimental covariance, assuming Gaussian errors, and then each pseudodata set was fit according to the nested hypothesis test prescription.
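The generation of correlated Gaussian pseudodata can be sketched with a Cholesky decomposition of the covariance matrix. The sketch below uses only the Python standard library and a hypothetical three-bin prediction and covariance; the actual study uses 40 bins and the experimental covariance of Ref. [1], which are not reproduced here.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular L with L L^T = cov (cov must be symmetric positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def pseudodata(pred, L, rng):
    """One pseudodata set: prediction plus correlated Gaussian fluctuations L g."""
    g = [rng.gauss(0.0, 1.0) for _ in pred]
    return [p + sum(L[i][k] * g[k] for k in range(i + 1)) for i, p in enumerate(pred)]


# HYPOTHETICAL three-bin prediction and covariance, for illustration only
pred = [10.0, 12.0, 9.0]
cov = [[4.0, 2.0, 0.0],
       [2.0, 3.0, 1.0],
       [0.0, 1.0, 2.0]]

L = cholesky(cov)
rng = random.Random(0)  # fixed seed so the ensemble is reproducible
toys = [pseudodata(pred, L, rng) for _ in range(5)]
for t in toys:
    print([round(x, 2) for x in t])
```

Each pseudodata set would then be passed through the full fit-selection prescription, so that the ensemble tests the selection and the fit together.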
The frequencies with which particular $\mathrm{BGL}_{ijk}$ parametrizations are selected are shown in Table III, for both the 1× and 10× scenarios. For each selected fit hypothesis, the recovered value, $|V_{cb}|_{\rm rec}$, and the associated uncertainty, $\sigma$, may then be used to construct a pull, i.e., the normalized difference $(|V_{cb}|_{\rm rec} - |V_{cb}|_{\rm true})/\sigma$, where $|V_{cb}|_{\rm true}$ is the "true" value used to construct the ensembles. If a fit or a procedure is unbiased, the corresponding pull distribution should follow a standard normal distribution (mean of zero, standard deviation of unity). In Fig. 4, the pull distributions for both the 1× and 10× scenarios are shown and compared to that of the $\mathrm{BGL}_{122}$ parametrization. One sees that the nested hypothesis test proposed in this paper selects fit hypotheses that provide unbiased values for $|V_{cb}|$ in both scenarios. However, the $\mathrm{BGL}_{122}$ fit shows significant biases. In the ensemble tests, the $\mathrm{BGL}_{122}$ fits have mean $\chi^2$ values of 41.0 and 56.6 for the 1× and 10× scenarios, respectively (with 35 d.o.f.). For the 1× scenario, this produces an acceptable fit probability on average. Nonetheless, the recovered value of $|V_{cb}|$ is biased by about $1.3\sigma$.
FIG. 4. The pull constructed from a large ensemble of pseudoexperiments using third-order terms of the 1× scenario (left plot) and 10× scenario (right plot) described in the text. The pulls of the fits selected by the nested hypothesis prescription (black) show no bias or undercoverage of uncertainties. Also shown in red is the pull from a $\mathrm{BGL}_{122}$ fit, showing a large bias on the value of $|V_{cb}|$. Mean ($\mu$) and standard deviation ($\sigma$) from normal distributions fitted to the ensembles are also provided.
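The pull construction can be illustrated with a deliberately simplified sketch in which the "fit" is replaced by a single Gaussian-distributed estimator. All numbers are hypothetical; in particular, the size of the injected bias is set to $1.3\sigma$ only to mimic the magnitude quoted above, and its sign is arbitrary.

```python
import random
import statistics


def pulls(true_value, sigma, n_toys, bias=0.0, seed=1):
    """Pulls (rec - true) / sigma for a Gaussian estimator with an optional bias,
    given in units of sigma. This stands in for a full fit to each pseudodata set."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_toys):
        rec = rng.gauss(true_value + bias * sigma, sigma)  # recovered value
        out.append((rec - true_value) / sigma)
    return out


# HYPOTHETICAL inputs: true value 40.0 with uncertainty 1.0 (arbitrary units)
unbiased = pulls(true_value=40.0, sigma=1.0, n_toys=20000)
biased = pulls(true_value=40.0, sigma=1.0, n_toys=20000, bias=1.3)

for name, p in (("unbiased", unbiased), ("biased", biased)):
    print(name, "mean:", round(statistics.fmean(p), 2),
          "std:", round(statistics.pstdev(p), 2))
```

An unbiased procedure with correctly estimated uncertainties gives a pull mean near 0 and a standard deviation near 1; a shifted mean signals a bias, while a standard deviation above 1 would signal undercoverage.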

VI. CONCLUSIONS
We studied the differences between the determinations of $|V_{cb}|$ from exclusive semileptonic $B\to D^*\ell\nu$ decays, depending on the truncation order of the BGL parametrization of the form factors used to fit the measured differential decay distributions. Since the 2018 untagged Belle measurement [2] used a five-parameter BGL fit, Refs. [14,20] used a six-parameter fit, and Refs. [13,22] used an eight-parameter one, we explored differences between the five-, six-, seven-, and eight-parameter fits.
We proposed using nested hypothesis tests to determine the optimal number of fit parameters. For the 2017 Belle analysis [1], six parameters are preferred. Including additional fit parameters only improves $\chi^2$ marginally. Comparing the result of the $\mathrm{BGL}_{122}$ fit used in the 2018 untagged Belle analysis [2] to the corresponding fit to the 2017 tagged Belle measurement [1], up to $2\sigma$ differences occur, including in the values of $|V_{cb}|$. This indicates that more precise measurements are needed to resolve tensions between various $|V_{cb}|$ determinations, and that the truncation order of the BGL expansion of the form factors has to be chosen with care, based on data.
We look forward to more precise experimental measurements, more complete fit studies inside the experimental analysis frameworks, as well as better understanding of the composition of the inclusive semileptonic rate as a sum of exclusive channels [33,34]. Improved lattice QCD results, including finalizing the form factor calculations in the full $w$ range [31,32], are also expected to be forthcoming. These should all contribute to a better understanding of the determinations of $|V_{cb}|$ from exclusive and inclusive semileptonic decays, which is important for CKM fits, new physics sensitivity, $\epsilon_K$, and rare decays.