Impact of unidentified light charged hadron data on the determination of pion fragmentation functions

In this paper, a new comprehensive analysis of parton-to-pion fragmentation functions (FFs) is performed for the first time by including all experimental data sets on single-inclusive pion as well as unidentified light charged hadron production in electron-positron ($e^+e^-$) annihilation. We determine the pion FFs along with their uncertainties using the standard "Hessian" technique at next-to-leading order (NLO) and next-to-next-to-leading order (NNLO) in perturbative QCD. We show that determining the pion FFs simultaneously from the pion and unidentified light charged hadron data sets reduces the uncertainties of all pion FFs by significant factors, especially for the strange quark and gluon FFs. In this study, we quantify the constraints that these data sets impose on the extracted pion FFs. Our results also illustrate the significant improvement in the precision of FF fits achievable by the inclusion of higher-order corrections. The improvements in both the FF uncertainties and the fit quality are discussed.


I. INTRODUCTION
A detailed understanding of the quark and gluon structure of the nucleon is an essential ingredient of theoretical predictions for present and future hadron colliders such as the Large Hadron Collider (LHC) and the Large Hadron-electron Collider (LHeC) [1][2][3][4]. The relevant inputs are the parton distribution functions (PDFs) [5][6][7] as well as the fragmentation functions (FFs) [8][9][10][11][12][13][14][15][16][17][18][19][20]. In recent years, the precise determination of PDFs and FFs, including their experimental uncertainties, has become an active topic for many LHC processes, including the top-quark and Higgs boson sectors, searches for new heavy beyond the Standard Model (BSM) particles, searches for new physics (NP), and the measurement of fundamental SM parameters such as the strong coupling constant. For more details, we refer the reader to the literature [21][22][23] and to a recent study of PDFs at the High-Luminosity LHC (HL-LHC) [1].
In a hard-scattering collision, PDFs determine how the proton's momentum is shared among its constituents. Likewise, the FFs describe the probability density for a final-state parton with a certain momentum to fragment into a hadron carrying a fraction of the parton's momentum. Both PDFs and FFs depend on the factorization scale. This dependence is described by the DGLAP evolution equations [24][25][26][27], which allow the PDFs and FFs to be calculated at any scale once they are known at a given initial scale, i.e. $\mu^2 = \mu_0^2$. It is well known that PDFs and FFs cannot be calculated in perturbation theory; hence, these distributions need to be extracted from experimental information through a QCD fit. In addition, these non-perturbative functions are universal. Universality commonly means that, since the hadronization process is not sensitive to the particular choice of short-distance hard-scattering process, these non-perturbative functions can be extracted from one class of scattering observables and then used for theory predictions of other scattering observables in high-energy collisions.
New and precise data sets are vital for the precise determination of FFs. Such data have been, and are currently being, collected from different high-energy processes at a variety of lepton and hadron colliders. These processes include hadron production in single-inclusive electron-positron ($e^+e^-$) annihilation (SIA), semi-inclusive deep inelastic scattering (SIDIS), and proton-proton and proton-antiproton collisions measured by TEVATRON, RHIC and LHC. For a list of all available data sets, we refer the reader to the recent analyses by the NNPDF collaboration and references therein [8,9]. Several analyses have been done so far to extract FFs using the observables mentioned above. Among them is the recent determination of charged hadron FFs from collider data by the NNPDF collaboration, NNFF1.1h [8]. This collaboration has also determined the pion, kaon, and proton FFs using SIA data sets at NNLO in perturbative QCD based on the NNPDF methodology, NNFF1.0 [9]. The recent analyses by HKKS16 [13] and JAM16 [28] were also performed using SIA data only. Other analyses in the literature can be found, for example, in Refs. [29][30][31][32][33][34][35][36]. Recently, we also performed the first determination of $D^{*\pm}$-meson FFs and their uncertainties at NNLO, SKM18 [10]. In Ref. [11] we presented our QCD analysis of charged hadron FFs and their uncertainties at NLO and NNLO (SGK18), which is the first determination of light charged hadron FFs at NNLO accuracy. Finally, in Ref. [12] the contributions from residual light charged hadrons to the inclusive charged hadrons were extracted using the $e^+e^-$ annihilation data sets. Since the QCD framework for FFs at NNLO is not available for SIDIS and hadron-hadron collisions, our analyses are restricted to single-inclusive charged hadron production in electron-positron annihilation.
The uncertainties in our recent analyses on FFs as well as the corresponding observables are estimated using the "Hessian" technique.
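The paper does not spell out the "Hessian" technique, so the following is only a minimal sketch of the textbook form of linear error propagation, assuming the uncertainty of an observable F is obtained from the gradient of F with respect to the fit parameters and the covariance matrix (the inverse Hessian of $\chi^2$). The function name, the tolerance argument `delta_chi2`, and all numbers are illustrative assumptions, not values from this analysis.

```python
import math


def hessian_uncertainty(gradient, covariance, delta_chi2=1.0):
    """Propagate parameter uncertainties to an observable F.

    gradient   : list of dF/da_i evaluated at the best-fit parameters
    covariance : inverse Hessian matrix C_ij as a list of lists
    delta_chi2 : tolerance; 1.0 is an assumed choice, not taken from the paper
    """
    n = len(gradient)
    variance = 0.0
    for i in range(n):
        for j in range(n):
            variance += gradient[i] * covariance[i][j] * gradient[j]
    return math.sqrt(delta_chi2 * variance)


# Toy numbers (hypothetical): two uncorrelated parameters with unit variance.
toy_gradient = [3.0, 4.0]
toy_covariance = [[1.0, 0.0], [0.0, 1.0]]
print(hessian_uncertainty(toy_gradient, toy_covariance))
```

In a real fit the gradient would be evaluated numerically for each FF at each value of z, and the covariance matrix would come from the minimizer.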
In this work, we present an extraction of pion FFs from a QCD analysis of electron-positron annihilation data in the zero-mass variable flavor number scheme (ZM-VFNS). The main aim of this paper is to examine, for the first time, the impact of unidentified light charged hadron experimental data on the determination of pion FFs and their uncertainties at NLO and NNLO accuracy. To this end, we determine the pion FFs in two different scenarios. First, we present a determination of pion FFs through a QCD analysis of the pion data sets only. This first fit, performed within the ZM-VFNS at both NLO and NNLO, is referred to as the "pion fit". Second, we determine the pion FFs through a QCD analysis that includes both pion and unidentified light charged hadron data sets. We show that fitting the pion FFs simultaneously to both data sets leads to a well-constrained determination of pion FFs, with a significant effect on the extracted uncertainties. This second fit is referred to as the "pion+hadron fit".
The outline of this paper is as follows. In Section II, we present in detail all available SIA data sets for pion production as well as the SIA data sets for unidentified light charged hadrons. In Section III, we discuss the theoretical formalism of single-inclusive hadron production in electron-positron ($e^+e^-$) annihilation. This section also includes a detailed discussion of our fitting procedure and the parametrization of the pion FFs. Section IV is then dedicated to our results. The results are discussed from a variety of aspects, and a comparison with other analyses in the literature is also presented. This section also includes our theory predictions based on the extracted pion FFs, together with a comparison with all data analyzed. Finally, Section V contains a summary and our conclusions.

II. EXPERIMENTAL DATA SELECTION
In this section, we present the experimental data sets that are included in our "pion fit" and "pion+hadron fit" analyses. As mentioned in the Introduction, our QCD fits are performed by including the electron-positron annihilation data in two scenarios. In the first analysis, we use the available SIA data for pions from Refs. [37][38][39][40][41][42][43][44][45][46] to extract the pion FFs. In the second analysis, the SIA data sets for unidentified charged hadrons [41,44,[46][47][48][49][50][51] are included in our fits along with the pion data sets to determine the pion FFs. All data sets for pions and unidentified hadrons, inclusive and flavor-tagged, as reported by the different experiments, are listed in Tables I and II.
Note that the measured observables for these data sets, especially for pions, differ; a complete explanation of the SIA pion data and of the relations between the scaling variables is available in the related NNFF1.0 analysis by the NNPDF collaboration [9].
In addition, we have used the unidentified light charged hadron experimental data in our recent study, SGK18 [11]. The details of the corrections to these data sets and of the kinematic cuts applied are presented in Ref. [11].
As listed in the second column of Tables I and II, the observables differ between data sets, and the flavor-tagged data provide limited sensitivity to the separation between light- and heavy-quark FFs. Since the gluon FF first contributes at $O(\alpha_s)$, the total SIA cross sections constrain it only poorly.
However, the longitudinal cross sections have a comparable sensitivity to the gluon FF because the longitudinal coefficient functions start at $O(\alpha_s)$. Hence, the longitudinal observables that are available for the unidentified hadrons could constrain the gluon FF well enough. It should be noted that the NNLO QCD corrections for the longitudinal structure functions are not available in the literature, and hence such corrections cannot be considered in our analyses.
In this paper, we study the effects of the unidentified light charged hadron experimental data on the determination of pion FFs by including both pion and unidentified hadron data sets, and we then compare the extracted pion FFs with the results obtained from the QCD analysis using the pion data sets alone. Since the largest contribution to the unidentified light charged hadron cross sections comes from the identified pion FFs, we are motivated to investigate the effect of the unidentified light charged hadron data sets on the reduction of the pion FF uncertainties. In Tables I and II, our results are reported at NLO and NNLO accuracy in perturbative QCD. In both tables, the fourth column presents our fit results for the value of $\chi^2$ per number of data points ($\chi^2/N_{pts.}$) when only the pion data sets are included in the fit, while the fifth column reports the same quantity when both the pion and light hadron experimental data sets are included in the analysis.
One of the most important findings from these tables is the significant reduction of $\chi^2/\mathrm{dof}$ when going from NLO to NNLO. We will return to this issue in the next section.
To reduce the sensitivity to the behavior of the FF parametrization in the low- and high-$z$ regions, we apply cuts on the momentum fraction $z$. We follow exactly the cuts applied in our recent study of light charged hadron FFs, SGK18 [11]. These selections are also imposed on the pion experimental data. For data sets at [...] Tables I and II for the NLO and NNLO accuracy, respectively.

III. THEORETICAL METHODOLOGY FOR CALCULATIONS AND FITTING
In this section, we briefly review the theoretical framework and our methodology. According to the factorization theorem, the SIA differential cross section normalized to the total cross section, $\frac{1}{\sigma_{tot}}\frac{d\sigma^{H^\pm}}{dz}$, at a given center-of-mass energy $\sqrt{s}$ is given by Eq. (1). This equation is used for identified charged hadrons such as $\pi^\pm$, $K^\pm$ and $p/\bar{p}$, and for unidentified hadrons $h^\pm$. In Eq. (1), $H^\pm$ denotes the sum over the two hadron charges, $H^\pm = H^+ + H^-$, and $z = 2E_H/\sqrt{s}$ is the scaling variable. The total cross section $\sigma_{tot}$ depends on the perturbative order of the QCD corrections; detailed explanations can be found, for example, in Ref. [11].
According to Eq. (1), in the case of multiplicities, the differential cross section for SIA processes can be decomposed into the time-like structure functions $F_T$ and $F_L$, which are the transverse (T) and longitudinal (L) parts, respectively. The time-like structure functions can be written as convolutions of a perturbative part, the coefficient functions $C_i(z, \alpha_s)$, and a nonperturbative part, the FFs $D_i^{H^\pm}(z, Q)$. The coefficient functions have been calculated in Refs. [52][53][54] and are available up to NNLO accuracy for electron-positron annihilation. In this analysis, the renormalization scale $\mu_R$ and the factorization scale $\mu_F$ are set equal to the center-of-mass energy of the collision, $\mu_R = \mu_F = \sqrt{s}$. Since the universal FFs are nonperturbative functions, determining them requires parametrizing the FFs of the partons $i = q, \bar{q}, g$ at a given initial scale. The variable $z$ represents the fraction of the parton momentum carried by the hadron.
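The convolution structure mentioned above can be illustrated numerically. The sketch below assumes the usual Mellin-type convolution, $(C \otimes D)(z) = \int_z^1 \frac{dx}{x}\, C(x)\, D(z/x)$, and evaluates it with a simple midpoint rule. The toy functions are placeholders: the real coefficient functions contain delta and plus distributions that this sketch does not handle, and the analysis itself relies on APFEL rather than on code of this kind.

```python
def mellin_convolution(coeff, frag, z, n_steps=20000):
    """Approximate (C (x) D)(z) = int_z^1 dx/x C(x) D(z/x) with the midpoint rule.

    coeff : callable C(x), a smooth toy stand-in for a coefficient function
    frag  : callable D(y), a toy stand-in for a fragmentation function
    """
    if not 0.0 < z < 1.0:
        raise ValueError("z must lie strictly between 0 and 1")
    width = (1.0 - z) / n_steps
    total = 0.0
    for k in range(n_steps):
        x = z + (k + 0.5) * width  # midpoint of the k-th sub-interval
        total += coeff(x) * frag(z / x) / x
    return total * width


# Toy check (hypothetical inputs): C(x) = 1 and D(y) = y give 1 - z analytically.
z_value = 0.2
numeric = mellin_convolution(lambda x: 1.0, lambda y: y, z_value)
print(numeric, 1.0 - z_value)
```

The analytic cross-check is useful because it exercises the $1/x$ measure and the $z/x$ argument, the two places where sign or argument mistakes are most common.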
Theoretically, the renormalization group equations govern the scale dependence of the FFs, which can be evolved to a given higher energy scale using the DGLAP evolution equations.
In our analysis, we use the publicly available APFEL package [55] to calculate the SIA cross sections as well as the evolution of the FFs by the DGLAP equations up to NNLO accuracy.
In addition, the ZM-VFNS is used to account for the heavy-quark contributions, and hence heavy-quark mass effects are not taken into account in our analysis.
Our main aim in this analysis is to study the effect of adding all the unidentified light [...] Since our aim in this analysis is a new determination of the pion FFs $D^{\pi^\pm}$, we use the kaon and proton FFs from the NNFF1.0 set [9] at both NLO and NNLO accuracy. Recently, we calculated the residual light hadron FFs $D^{res^\pm}$ in Ref. [12] up to NNLO QCD corrections.
In Ref. [12], we showed that the contribution of the residual light hadrons is small, and hence one can ignore it in Eq. (3). This contribution is not significant for the total or light-tagged cross sections; for the c- and b-tagged cross sections, however, it is sizable.
For the uncertainty from NNFF1.0, we follow the DSS07 analysis in Ref. [36] and estimate an average uncertainty of 5% in all theoretical calculations of the inclusive charged hadron cross sections, stemming from the large uncertainties of the kaon and proton FFs of the NNFF1.0 set. In addition, our recent study shows that an additional uncertainty due to the contribution of the residual charged hadron FFs [12] also needs to be taken into account.
Overall, we believe that an uncertainty of 8% of the cross section value is reasonable. These additional uncertainties are included in the $\chi^2$ minimization procedure for determining the pion FFs. To include them, we take the simplest approach: a "theory" error is added in quadrature to the statistical and systematic experimental errors in the $\chi^2$ expression. This is the standard approach that one can use to add this [...] 4.92 GeV [8,9], respectively. The Z-boson mass is chosen to be $M_Z = 91.187$ GeV, and the QCD coupling constant is fixed to the world average $\alpha_s(M_Z) = 0.1185$ [56].
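The quadrature prescription described above can be sketched as follows. This is an illustration under stated assumptions, not the fitting code of the analysis: the 8% is taken relative to the measured cross section, all errors are treated as uncorrelated point-to-point errors, and the function name and numbers are hypothetical.

```python
def chi2_with_theory_error(data, theory, stat, syst, theory_frac=0.08):
    """Sum of squared pulls with a fractional "theory" error added in quadrature.

    data, theory : measured and predicted cross sections, point by point
    stat, syst   : statistical and systematic experimental errors
    theory_frac  : fractional theory error (assumed relative to the data value)
    """
    chi2 = 0.0
    for d, t, s_stat, s_syst in zip(data, theory, stat, syst):
        s_theory = theory_frac * d
        sigma2 = s_stat ** 2 + s_syst ** 2 + s_theory ** 2
        chi2 += (d - t) ** 2 / sigma2
    return chi2


# Toy point (hypothetical numbers): sigma^2 = 0.6^2 + 0^2 + (0.08 * 10)^2 = 1.0
print(chi2_with_theory_error([10.0], [9.0], [0.6], [0.0]))
```

Setting `theory_frac=0.0` recovers the purely experimental $\chi^2$, which makes it easy to see how much the extra error relaxes the pull of each data point.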
We are now in a position to present our QCD fit methodology, the input functional form, and the assumptions used in our analysis to determine the pion FFs. We choose a flexible input parametrization for the pion FFs at the initial scale $Q_0$, Eq. (4), which we also used in our very recent analysis of unidentified light charged hadrons [11], where $i = u^+, d^+, s^+, c^+, b^+$ and $g$, with $q^+ = q + \bar{q}$. To normalize the parameter $N_i$, we use the Euler Beta function $B[a, b]$. Since we include the NNFF1.0 FF sets for the kaon and proton, we choose the initial scale $Q_0 = 5$ GeV, and therefore the number of active flavors in our analyses is fixed at $n_f = 5$. In addition, charge conjugation and isospin symmetry, $D^{\pi^\pm}_{u^+} = D^{\pi^\pm}_{d^+}$, are assumed. More specifically, the $\gamma$ and $\delta$ parameters for $s^+$, $c^+$ and $g$ cannot be well constrained by the SIA data, and we are forced to fix them to $\gamma_{s^+,c^+,g} = 0$ and $\delta_{s^+,c^+,g} = 0$. The best fit therefore uses all five parameters of Eq. (4) only for $u^+$ and $b^+$. We determine 19 free parameters by a standard $\chi^2$ minimization strategy, the details of which can be found in Refs. [11,57].
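The parameter counting implied by these assumptions can be checked with a short bookkeeping sketch. The names `N`, `gamma` and `delta` follow the text; `alpha` and `beta` are assumed labels for the remaining two of the five parameters per flavor, since Eq. (4) is not reproduced here.

```python
# Free parameters per independently fitted flavor combination.
# d+ is not listed because isospin symmetry ties it to u+.
# gamma and delta are fixed to zero for s+, c+ and g, so they are not free.
FREE_PARAMETERS = {
    "u+": ["N", "alpha", "beta", "gamma", "delta"],
    "s+": ["N", "alpha", "beta"],
    "c+": ["N", "alpha", "beta"],
    "b+": ["N", "alpha", "beta", "gamma", "delta"],
    "g": ["N", "alpha", "beta"],
}


def count_free_parameters(table):
    """Return the total number of free parameters in the table."""
    return sum(len(names) for names in table.values())


total_free = count_free_parameters(FREE_PARAMETERS)
print(total_free)  # 19
```

The total of 5 + 3 + 3 + 5 + 3 = 19 agrees with the number of free parameters quoted in the text.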
The free parameters are determined from the best fit and are listed in Table III. In the second and third columns of this table, we report our best-fit parameters for the pion-only analysis at NLO and NNLO accuracy, respectively. The parameters reported in the fourth and fifth columns are for the analyses with both pion and unidentified hadron data sets at the two perturbative orders.

IV. ANALYSIS RESULTS
Having presented in the previous sections the experimental data sets included in this work and the theoretical and phenomenological framework of the analysis, we now present the numerical results obtained for the pion FFs from the different analyses and compare them with each other. As mentioned before, the main goal of the present work is to investigate, for the first time, the impact of unidentified light charged hadron experimental data on the pion FFs at both NLO and NNLO accuracy. To this end, the pion FFs are determined by performing two different analyses: 1) a determination of pion FFs through a QCD analysis of the pion data sets only, as usual (pion fit), and 2) a determination of pion FFs through a simultaneous analysis of both pion and unidentified light charged hadron data sets (pion+hadron fit).
The important point that should be noted is the presence of the kaon, proton and residual FFs in the theoretical calculation of the unidentified light charged hadron cross sections which is required for the second analysis. As discussed in Sec. III, we use the kaon and proton FFs from the NNFF1.0 analysis [9] and ignore the small residual contribution. Hence, some theoretical uncertainties should be taken into account in the analysis containing the unidentified light charged hadron data. One of the most common methods is adding a point-to-point uncertainty to the experimental data as a systematic error source, 8% in our analyses.

A. Comparison of χ 2 values
The experimental data sets, their references, and the results of the analyses introduced above are summarized in Tables I and II at NLO and NNLO, respectively. In each table, the second column indicates the observable measured by each experiment and the third column gives the corresponding center-of-mass energy. The columns labeled "pion" and "pion+hadron" contain the results of the first and second analyses, respectively. The values of $\chi^2$ per number of data points ($\chi^2/N_{pts.}$) are presented in these columns for each data set. Moreover, the total $\chi^2$ divided by the number of degrees of freedom ($\chi^2/\mathrm{dof}$) for each analysis is presented in the last row of the table. The total number of data points included in the "pion fit" analysis is 405, while it is 879 for the "pion+hadron fit" analysis. According to the results obtained, the following conclusions can be drawn. For the NLO analyses, although the values of $\chi^2/N_{pts.}$ increase for almost every pion data set after the inclusion of the unidentified light charged hadron data, the values of $\chi^2/\mathrm{dof}$ for the "pion fit" and "pion+hadron fit" analyses are almost equal. Such behavior is also seen for some of the data sets in the NNLO analyses, but with the difference that the value of $\chi^2/\mathrm{dof}$ decreases when the unidentified light charged hadron data are included in the analysis. Another point to note is the significant reduction in the value of $\chi^2/\mathrm{dof}$ when moving from NLO to NNLO. The optimum values of the fit parameters are presented in Table III, where the first and second columns correspond to the pion data analyses at NLO and NNLO, respectively, while the third and fourth columns contain the results of the simultaneous analyses of the pion and hadron data at NLO and NNLO accuracy.
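As a small worked example of the quantities compared above, the sketch below computes the number of degrees of freedom and $\chi^2/\mathrm{dof}$. It assumes the common convention that the degrees of freedom are the number of data points minus the number of free parameters (19 in both fits); the total $\chi^2$ value passed to the second function is a hypothetical placeholder, not a result from Tables I and II.

```python
def degrees_of_freedom(n_points, n_params):
    """Number of data points minus number of fitted parameters (assumed convention)."""
    dof = n_points - n_params
    if dof <= 0:
        raise ValueError("need more data points than free parameters")
    return dof


def chi2_per_dof(chi2_total, n_points, n_params):
    """Total chi^2 divided by the number of degrees of freedom."""
    return chi2_total / degrees_of_freedom(n_points, n_params)


N_PARAMS = 19  # free parameters of the pion FFs, as stated in the text
print(degrees_of_freedom(405, N_PARAMS))  # "pion fit": 386
print(degrees_of_freedom(879, N_PARAMS))  # "pion+hadron fit": 860
print(chi2_per_dof(430.0, 860, 0))        # hypothetical chi^2 total, for illustration only
```

Because the "pion+hadron fit" more than doubles the number of points while keeping the same parameters, its $\chi^2/\mathrm{dof}$ can stay flat or fall even when individual pion data sets are described slightly worse.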

B. Comparison of the relative uncertainties
To investigate the impact of including the unidentified light charged hadron experimental data on the pion FFs, in both behavior and uncertainty, the results obtained from the "pion fit" and "pion+hadron fit" can be compared in various ways. One of the best ways to check the quality of the new results, specifically with regard to the uncertainties, is to compare the relative uncertainties of the extracted distributions, which are obtained for each analysis separately by dividing the upper and lower bands by the central values. Fig. 1 shows a comparison between the relative uncertainties of the pion FFs obtained from the "pion fit" and "pion+hadron fit" analyses at NLO accuracy. We present the results for all flavors parametrized in the analysis at the initial scale $Q_0 = 5$ GeV. As can be seen, except for the $s+\bar{s}$ FF, the relative uncertainties of the pion FFs obtained from the simultaneous analysis of the pion and hadron data are smaller than those obtained by fitting the pion data alone, especially for the gluon FF. In fact, the absolute uncertainty of the $s+\bar{s}$ FF from the "pion+hadron fit" analysis is also smaller than that from the "pion fit" analysis (as will be shown later), but since its central value is smaller by a factor of two, its relative uncertainty is somewhat larger overall. Fig. 2 shows the same results as Fig. 1, but for our NNLO analysis. One can clearly conclude that the inclusion of the unidentified light charged hadron data in the pion FF analysis at NNLO accuracy also leads to a smaller relative uncertainty for all flavors. Note that, compared with the NLO results, the relative uncertainty of the $s+\bar{s}$ FF from the "pion+hadron fit" analysis is now remarkably smaller at lower $z$ values than that from the "pion fit" analysis.
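The definition of the relative uncertainty used above, and the reason a smaller absolute uncertainty can still give a larger relative one, can be made concrete with a short sketch. All numbers are invented for illustration and only mimic the qualitative $s+\bar{s}$ situation described in the text (central value smaller by a factor of two, absolute band narrower).

```python
def relative_band(central, upper, lower):
    """Divide the upper and lower uncertainty bands by the central value."""
    if central == 0.0:
        raise ValueError("central value must be non-zero")
    return upper / central, lower / central


def relative_half_width(central, upper, lower):
    """Half of the relative band width, a single-number summary of the band."""
    rel_up, rel_low = relative_band(central, upper, lower)
    return 0.5 * (rel_up - rel_low)


# Hypothetical FF values at one z point (not taken from the fits).
pion_only = {"central": 0.40, "upper": 0.50, "lower": 0.30}    # absolute error 0.10
pion_hadron = {"central": 0.20, "upper": 0.28, "lower": 0.12}  # absolute error 0.08

print(relative_half_width(**pion_only))    # close to 0.25
print(relative_half_width(**pion_hadron))  # close to 0.40
```

In this toy case the absolute band shrinks from 0.10 to 0.08, yet the relative band grows because the central value has halved.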
Overall, the results obtained indicate that by performing a simultaneous analysis of pion and unidentified light charged hadron data, a pion FFs set with more acceptable uncertainties can be obtained at both NLO and NNLO accuracies.
To study the effects of the evolution and also evaluate the results at a given higher [...] Let us focus on each flavor separately to discuss the changes in more detail.
For the $u+\bar{u}$ FF, no significant change can be seen between the "pion fit" and "pion+hadron fit" analyses. However, both analyses give results that differ from the $u+\bar{u}$ FF of NNFF1.0 for almost all values of $z$. The difference is more significant at lower values of $z$, where it reaches up to 30%. The second panel of Fig. 5 shows that the inclusion of hadron data in the analysis of pion FFs at NLO can put further constraints on the $s+\bar{s}$ FF, especially in the medium to small $z$ regions, so that the uncertainty is remarkably [...] difference that now the discrepancy observed between the $s+\bar{s}$ and also gluon FFs from the "pion fit" and "pion+hadron fit" analyses in the medium $z$ region is more moderate than before. For example, the difference between the gluon FFs obtained from these two analyses at $z \approx 0.4$ is less than 50% according to the last panel of Fig. 6, while it is more than 100% at NLO (see Fig. 5). Another point to note is that the $u+\bar{u}$ and $c+\bar{c}$ FFs still remain unchanged after the inclusion of the unidentified light charged hadron data in the analysis, and the $b+\bar{b}$ FF grows rapidly at large $z$ values, just as in the NLO case.

C. Comparison of the "pion+hadron fit" at NLO and NNLO accuracy
Considering the "pion+hadron fit" analysis as our final and preferred analysis for determining the pion FFs from SIA data, it is also of interest to compare the distributions obtained at NLO and NNLO accuracy. A comparison between the NLO and NNLO pion FFs determined from a simultaneous analysis of pion and unidentified light charged hadron data, for all flavor distributions at $Q_0 = 5$ GeV, is shown in Fig. 7. Although the uncertainty band of the $s+\bar{s}$ FF at NNLO is larger than the NLO one, the relative uncertainties of the two distributions (similar to Fig. 1) are of the same order.

D. Comparison of the data and theory predictions
Now we are in a position to complete our study of the fit quality as well as the data vs. theory comparisons.
Here we focus on the theory predictions based on the pion FFs extracted from our "pion+hadron fit" analysis. We consider only the NNLO results and calculate the normalized total, light-, c- and b-tagged cross sections. To begin with, in Fig. 8 we show detailed comparisons of $\frac{1}{\sigma_{tot}}\frac{d\sigma^{\pi^\pm}}{dz}$ with the SIA data sets analyzed in this study. These data sets include charged pion production at the ALEPH, DELPHI, SLD and OPAL experiments. As this comparison shows, the agreement between the analyzed data sets and the theoretical predictions is excellent over a wide range of $z$, which demonstrates both the validity and the quality of the QCD fits. In Fig. 9, we compare the NNLO theory based on our "pion+hadron fit" with charged pion production at the BABAR and BELLE experiments. This figure shows again that the agreement between data and theory is excellent.
As a short summary, considering the impact of these two types of data on the pion FFs, shown in the plots presented in this section, one sees that in the case of "pion+hadron [...] showing that the inclusion of two data sets simultaneously is somewhat more constraining.

V. SUMMARY AND CONCLUSIONS
In this study, we have quantified the constraints that the unidentified light charged hadron data sets impose on the determination of pion FFs. To achieve this goal, new determinations of pion FFs at NLO and NNLO in QCD have been carried out based on a comprehensive set of SIA data. We determine the pion FFs from QCD analyses of two different data selections. First, the pion FFs are determined through QCD analyses of the pion experimental data sets alone, which is referred to as the "pion fit". Beyond the pion data, one may expect that other sources of experimental information provide further constraints and an improved knowledge of the pion FFs. Although the data sets for pion production in electron-positron annihilation include inclusive, uds-tagged, c-tagged and b-tagged observables, some of the parameters of the pion FFs at the initial scale cannot be constrained well enough. Since the largest contribution to the unidentified light charged hadron cross sections in SIA measurements comes from identified pion production, one can expect further constraints from adding these data sets to the QCD fits. Hence, for this new determination of pion FFs, we have explicitly chosen our input data set and determined the pion FFs by simultaneously including the pion and unidentified light charged hadron data sets in our analysis, which is referred to as the "pion+hadron fit". Our main finding is that using the pion experimental data along with the unidentified light charged hadron data sets has the potential to significantly reduce the pion FF uncertainties over a wide kinematic range of the momentum fraction $z$.
According to the plots presented in this study, one can clearly see the reduction of the pion FF uncertainties over almost the whole range of $z$. The largest effects of adding the unidentified light charged hadron data sets in the "pion+hadron fit" analysis are seen for the $s+\bar{s}$ and gluon FFs. Not only do the uncertainties of the $s+\bar{s}$ and gluon FFs decrease, but the behavior of their central values also changes considerably. Consequently, our study shows that using unidentified light charged hadron observables together with pion production data sets in a determination of pion FFs leads to a somewhat better fit quality. Since higher-order corrections are significant, we set out to study the effect of higher-order corrections on the determination of pion FFs. Since we include only SIA data sets in our analyses, the perturbative QCD corrections can be considered up to NNLO accuracy.
We found that the NNLO corrections improve the fit quality in comparison to NLO, reducing the $\chi^2$ for all data sets separately as well as the total $\chi^2$. With the NNLO corrections, slight improvements in the FF uncertainties are also found in some regions of $z$.
The two analyses presented in this study share, however, a common limitation. In both cases, it would be desirable to include other sources of experimental information, such as data from semi-inclusive deep inelastic scattering (SIDIS) and from proton-proton and proton-antiproton collisions measured by TEVATRON, RHIC and LHC. However, the NNLO calculations for such processes are not yet available, and obtaining them would require a sustained effort in QCD calculations. It is worth mentioning that the investigations in this study could be extended to a new determination of kaon and proton FFs, considering the unidentified light charged hadron data sets as well as the identified charged hadron production observables. More detailed discussions of these new determinations of kaon and proton FFs will be presented in our upcoming study.