Renormalons in integrated spectral function moments and $\alpha_s$ extractions

Precise extractions of $\alpha_s$ from $\tau\to {\rm (hadrons)}+\nu_\tau$ and from $e^+e^-\to {\rm (hadrons)}$ below the charm threshold rely on finite energy sum rules (FESRs) where the experimental side is given by integrated spectral function moments. Here we study the renormalons that appear in the Borel transform of polynomial moments in the large-$\beta_0$ limit and in full QCD. In large-$\beta_0$, we establish a direct connection between the renormalons and the perturbative behaviour of moments often employed in the literature. The leading IR singularity is particularly prominent and is behind the fate of moments whose perturbative series are unstable, while those with good perturbative behaviour benefit from partial cancellations of renormalon singularities. The conclusions can be extended to QCD through a convenient scheme transformation to the $C$-scheme together with the use of a modified Borel transform which make the results particularly simple; the leading IR singularity becomes a simple pole, as in large-$\beta_0$. Finally, for the moments that display good perturbative behaviour, we discuss an optimized truncation based on renormalisation scheme (or scale) variation. Our results allow for a deeper understanding of the perturbative behaviour of integrated spectral function moments and provide theoretical support for low-$Q^2$ $\alpha_s$ determinations.


Introduction
Extractions of the strong coupling, α s , at lower energies can be very precise due to increased sensitivity to the higher-order corrections, as long as the non-perturbative contributions are under good control. The prominent example of this type of α s determination is the extraction from inclusive hadronic decays of the τ lepton, which have been used since the 90s as a reliable source of information about QCD dynamics [1,2]. Although the decay rate receives a non-negligible contribution from non-perturbative effects, it is largely dominated by perturbative QCD, which renders feasible a competitive extraction of the strong coupling [3][4][5][6]. Recently, a similar α s determination was introduced [7] making use of a compilation of data for e + e − → (hadrons) below the charm threshold [8]. An attractive feature of this new analysis is that the systematics is under very good control, although the error due to the data is still somewhat large. 1 Both analyses rely on finite energy sum rules (FESR) where, on the experimental side, one has weighted integrals of the experimentally accessible hadronic spectral functions. Exploiting the analyticity properties of the quark-current correlators one is able to express the theoretical counterpart of the sum rules as an integral in a closed contour on the complex plane of the variable s -which represents the invariant mass of the final-state hadrons -thereby circumventing the breakdown of perturbative QCD at low energies. In this framework, the perturbative contribution is obtained from the complex integration of the Adler function in the chiral limit, which nowadays is exactly known up to α 4 s [10,11]. When performing this integration, one must adopt a procedure to set the renormalisation scale. 
The two most widely used ones are Fixed Order Perturbation Theory (FOPT) [12], in which the scale is kept fixed, and Contour Improved Perturbation Theory (CIPT) [13,14], where the scale varies along the contour, resumming the running of the coupling. The two procedures lead to different series and hence to different values of $\alpha_s$. This difference remains one of the dominant uncertainties in the $\alpha_s$ extraction from $\tau$ decays [4,6]. In the case of $e^+e^-\to {\rm (hadrons)}$, the difference is significantly smaller, but still non-negligible [7].
In the discussion of perturbative expansions in QCD one must take into account a basic but important fact: the perturbative series are divergent and, at best, they are asymptotic expansions -as discovered by Dyson in the context of QED in 1952 [15]. The series is better understood in terms of its Borel transform, which suppresses the factorial growth of the perturbative coefficients and allows for an understanding of the higher-order behaviour in terms of singularities along the real axis in the Borel plane. These singularities are the renormalons of perturbation theory [16].
An optimal use of an asymptotic series of this type can be achieved (most often) by truncating it at the smallest term [17]. In this procedure, the error one makes is parametrically of the form $e^{-p/\alpha}$, where $p > 0$ is a constant and $\alpha$ is the expansion parameter. In QCD, the expansion parameter, $\alpha_s(Q^2)$, runs logarithmically, which implies that the truncation error is $\sim (\Lambda_{\rm QCD}^2/Q^2)^p$, where $Q^2$ is the Euclidean momentum. These power corrections are a necessary feature of perturbative QCD and are, of course, related to the higher-dimension terms in the Operator Product Expansion (OPE). In the Borel plane, their manifestation is the appearance of renormalon singularities along the real axis at specific locations related to their dimensionality.
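The truncation rule described above can be illustrated with a minimal sketch. It uses a toy series that is not taken from this work, $\sum_n (-1)^n n!\,\alpha^{n+1}$, whose Borel sum is $\int_0^\infty dt\, e^{-t/\alpha}/(1+t)$. The function names and the value $\alpha = 0.15$ are illustrative assumptions.

```python
import math

def borel_sum(alpha, s_max=60.0, steps=60000):
    """Simpson estimate of alpha * int_0^inf ds e^{-s} / (1 + alpha*s).

    This equals int_0^inf dt e^{-t/alpha} / (1 + t) after t = alpha*s;
    the tail beyond s_max is negligible (of order e^{-60}).
    """
    h = s_max / steps
    f = lambda s: math.exp(-s) / (1.0 + alpha * s)
    total = f(0.0) + f(s_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return alpha * total * h / 3.0

def term(n, alpha):
    """n-th term of the toy asymptotic series sum_n (-1)^n n! alpha^(n+1)."""
    return (-1) ** n * math.factorial(n) * alpha ** (n + 1)

def optimal_truncation(alpha, n_max=30):
    """Return (order of the smallest term, partial sum through that order)."""
    n_min = min(range(n_max), key=lambda n: abs(term(n, alpha)))
    partial = sum(term(n, alpha) for n in range(n_min + 1))
    return n_min, partial

alpha = 0.15  # illustrative value of the expansion parameter
n_min, partial = optimal_truncation(alpha)
exact = borel_sum(alpha)
error = abs(partial - exact)
print("smallest term at order", n_min)
print("truncated sum", partial, "Borel sum", exact)
print("error", error, "vs exp(-1/alpha)", math.exp(-1.0 / alpha))
```

For this sign-alternating toy series the remainder after truncation is bounded by the first omitted term, so the error at the optimal order is of the size of the smallest term, which is exponentially small in $1/\alpha$. Fixed-sign (IR renormalon) series are not Borel summable in this simple way and are not covered by the sketch.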
In realistic α s analyses the non-perturbative contributions must be taken into account. These include the OPE condensates as well as duality violations (DVs) which are due to resonances and are not encoded in the OPE expansion [18][19][20][21] . In order to extract from the data α s and the non-perturbative parameters in a self-consistent way, i.e. without relying on external information, one resorts to the use of several (pseudo) observables. Those are built using the fact that any analytic weight function gives rise to a valid FESR, with an experimental side that can be computed from the empirical spectral functions and a theoretical counterpart that can be obtained from the integral along the complex contour.
The main guiding principle behind the judicious choice of weight functions that enter a given analysis has been, for a long time, the suppression of non-perturbative contributions. The different analyses of hadronic tau decay data can be divided into two categories. In one of the analysis strategies, one strongly suppresses the poorly known higher-order OPE condensates [4,7]. In this case, duality violations are larger and one must include them; this is done relying on a parametrization that can be connected with fundamental properties of QCD [21]. In the other, only moments that suppress duality violations are used [5,6]. The price to pay in this case is the contamination of the results by the neglected higher-order OPE condensates [22]. Apart from issues related to non-perturbative contaminations, since the work of Ref. [23], it is known that the different weight functions lead to distinct perturbative series that are not equally well behaved. Some of those used in the literature [5,6] have a poor perturbative convergence and are therefore not the ideal choice in precise α s analyses.
The main purpose of this work is to understand the perturbative behaviour of the different integrated spectral function moments at intermediate and high orders by studying the renormalon singularities appearing in their Borel transform. The perturbative behaviour of the different moments is intricate: each moment is a different asymptotic expansion with different renormalon contributions, and conclusions about their perturbative behaviour have to be drawn almost case by case. As is customary, we will use the large-$\beta_0$ limit of QCD as a guide. In this limit, all renormalon singularities are double poles, with the exception of the leading IR singularity, which is a simple pole. A number of facts can be established. First, the Borel transform of polynomial moments of the Adler function is always less singular than the Borel transform of the Adler function itself: an infinite number of renormalon poles become simple poles. Second, the renormalon poles corresponding to the OPE condensate(s) to which the moment is maximally sensitive are not reduced (or cancelled). What we mean by "maximally sensitive" will become clearer in the remainder, but these two facts are enough to draw interesting conclusions about the behaviour of the different moments and help explain the instabilities (or "run-away behaviour") identified in Ref. [23]. We will show that the leading IR renormalon is largely responsible for the unstable behaviour of moments that are highly sensitive to the gluon condensate. We also show that the absence of the leading IR pole and partial cancellations of the renormalon singularities are behind the good perturbative behaviour of some of the moments.
Turning to QCD, the situation is more complicated, mainly because the renormalon poles become branch points. The Borel transform has superimposed branch cuts. We will show that most of the difficulties in QCD can be circumvented by a convenient scheme transformation, to the so-called C scheme [24], together with the use of a modified Borel transform introduced in Ref. [25]. 2 In this framework, the Borel transform of the moments can be calculated exactly in terms of the Borel transform of the Adler function -which is one of the main results of this paper, Eq. (41). The parallel with the large-β 0 limit is apparent and the results are formally identical. In fact, we show that the leading IR singularity is also a simple pole in this case. The enhancement and suppression of renormalon singularities identified in large-β 0 is, therefore, also present in QCD which explains the similarity between the perturbative behaviour of moments in the two cases. We then study the behaviour of a few emblematic moments in QCD, using a recent reconstruction of the higher-order terms based on Padé approximants [27]. Finally, we show how to optimize the truncation of the moments with good perturbative behaviour in the spirit of an asymptotic series exploiting scheme transformations. The procedure we employ has been suggested for the τ hadronic width in Ref. [24] but had never been investigated systematically for different integrated moments.
This work is organised as follows. In Sec. 2, we present the theoretical framework. In Sec. 3, we discuss the renormalon content of polynomial moments and their phenomenological consequences, both in large-β 0 and in QCD. In Sec. 4, we discuss the optimized truncation of the moments with good perturbative behaviour through scheme transformations. In Sec. 5, we present our conclusions. Finally, in App. A we present our conventions for the QCD β function; App. B contains further details about the Borel integral of the moments discussed in this work.

Theoretical framework
In the low-Q 2 α s determinations from hadronic τ decays and from e + e − → (hadrons) one uses FESRs constructed from integrated moments of the experimental hadronic spectral functions. In the case of e + e − → (hadrons), one has access to the electromagnetic vacuum polarization spectral function, which mixes isospin 0 and 1. Below the charm threshold one can safely work in the chiral limit, apart from the inclusion of perturbative corrections arising from the strange-quark mass. In hadronic τ decays, the decay width of the τ lepton into hadrons normalized to the decay width of τ → ν τ e −ν e can be separated experimentally into three distinct components: the vector and axial vector, R τ,V /A , arising from the (ūd)-quark current, and the contributions with net strangeness, intermediated by the (ūs)-quark current. In the extractions of the strong coupling α s , the focus is on the non-strange contributions since they have a smaller contamination from non-perturbative effects and the quark masses in this case can safely be neglected. Since the FESRs we discuss here, and in particular the choice of moments, were primarily introduced in the context of τ decays, we will present them in this context. The translation to e + e − → (hadrons) is straightforward and the perturbative contribution, in particular, is essentially identical [7].
Footnote 2: The use of modified Borel transforms in combination with the C scheme in similar contexts has been suggested by M. Jamin and S. Peris [26].
We define a generalized observable $R^{(w_i)}_{\tau,V/A}(s_0)$ that can be written as a weighted integral over the experimentally accessible spectral functions as
where w i (x) is any analytic weight function and x = s/s 0 . The correlators Π formed from the quark currents (1) and with the particular choice of weight function , the hadronic decay width normalized to the decay width of τ − → ν τ e −ν e . In precise extractions of α s from τ decays it has become customary to exploit other analytic weight functions, conveniently chosen in order to suppress or enhance the different contributions to the decay rate. The generalized observable R where N c is the number of colours, S EW is an electroweak correction, and V ud is the quark-mixing matrix element. The perturbative terms are represented by δ tree w i and δ where the sum is done over all the contributions from gauge invariant operators of dimension D. The case D = 0 is the perturbative part and D = 2 are small mass corrections. The first non-perturbative contribution starts at D = 4 and is dominated by the gluon condensate. The s dependence in the Wilson coefficients C D (s) arise from the logarithms in their perturbative description and is higher-order in α s . In the case of the gluon condensate the leading logarithm is known but, to an excellent approximation, the coefficient C 4 (s) can be treated as a constant [28]. Little is known for the logarithms in the higher-dimension condensates, but it is customary, based on the experience with D = 4, to neglect their s dependence as well and treat all C D as effective coefficients with no s dependence. The theoretical treatment of the observables R (w i ) τ,V /A is done in the framework of FESRs, relating the experimental results to counter-clockwise contour integrals along the circle |s| = s 0 in the complex plane of the variable s. To eliminate the conventions related to renormalisation it is convenient to work with the Adler function In terms of the Adler function, the perturbative correction of Eq. (4) can be written as [12] δ (0) where W i (x) = 2 1 x dz w i (z) is the weight function. 
The reduced Adler function, $\hat D$, which intervenes in Eq. (7), is defined in order to separate the partonic contribution,
$$\frac{12\pi^2}{N_c}\, D(s) \equiv 1 + \hat D(s), \tag{8}$$
where $Q^2 \equiv -s$. Accordingly, the perturbative expansion of the function $\hat D$ starts at order $\alpha_s$ and can be written as
$$\hat D_{\rm pert}(s) = \sum_{n=1}^{\infty} a_\mu^n \sum_{k=1}^{n} k\, c_{n,k} \ln^{k-1}\!\left(\frac{-s}{\mu^2}\right), \tag{9}$$
where $a_\mu = \alpha_s(\mu)/\pi$. The only independent coefficients in this expansion are the $c_{n,1}$; all the others can be written with the use of Renormalisation Group (RG) equations in terms of the $c_{n,1}$ and $\beta$-function coefficients. At present, the coefficients of the expansion are known up to $c_{4,1}$ (five loops) [10,11]. Resumming the logarithms with the choice $\mu^2 = -s$, the result is (for $n_f = 3$)
$$\hat D_{\rm pert}(s) = \sum_{n=1}^{\infty} c_{n,1}\, a_Q^n = a_Q + 1.640\, a_Q^2 + 6.371\, a_Q^3 + 49.08\, a_Q^4 + \cdots, \tag{10}$$
from which the known independent coefficients can be read off. (Henceforth we will often omit the subscript "pert" in perturbative quantities.) The perturbative series of Eq. (9) is divergent. It is assumed that it must be an asymptotic series [16] to the true (unknown) value of the function being expanded. The divergence stems from the factorial growth of the $c_{n,1}$ coefficients at large order and it is, therefore, convenient to work with the Borel-Laplace transform of the series, which has a finite radius of convergence,
$$B[\hat D](t) \equiv \sum_{n=0}^{\infty} r_n \frac{t^n}{n!}, \tag{11}$$
where $r_n = c_{n+1,1}/\pi^{n+1}$. The original expansion is then, by construction, the asymptotic series to the inverse Borel transform (the usual Laplace transform) given by
$$\hat D(\alpha) \equiv \int_0^\infty dt\, e^{-t/\alpha}\, B[\hat D](t). \tag{12}$$
On the assumption that the integral exists, the last equation defines unambiguously the Borel sum of the asymptotic series. However, the divergence of the original series is related to singularities in the $t$ variable known as renormalons. They appear at both positive and negative integer values of the variable $u = \beta_1 t/(2\pi)$ (with the exception of $u = 1$). In particular, the IR renormalons, which lie on the positive real axis, obstruct the integration in the Borel sum. A prescription to circumvent these poles becomes necessary, which entails an ambiguity in the Borel sum of the series.
This remaining ambiguity is expected on general grounds to be cancelled by corresponding ambiguities in the power corrections of the OPE. At large orders, the UV pole at u = −1, being the closest to the origin, dominates the behaviour of the series. The coefficients of the series are, therefore, expected to diverge with sign alternation at sufficiently high orders.
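The statement about large-order dominance can be made concrete with a small sketch. It assumes that each renormalon is replaced by an isolated simple pole $1/(1-u/p)$ with unit normalisation, which is a simplification rather than the actual Adler function, and that $\beta_1 = 9/2$ for $n_f = 3$ in the convention $u = \beta_1 t/(2\pi)$. The function name is hypothetical.

```python
import math

# Assumed one-loop beta-function coefficient for n_f = 3,
# in the convention where u = beta_1 * t / (2*pi).
BETA1 = 4.5

def pole_coefficient(n, p):
    """Series coefficient r_n generated by a simple Borel-plane pole 1/(1 - u/p).

    With B(t) = sum_n r_n t^n / n! and u = BETA1 * t / (2*pi), expanding the
    geometric series gives r_n = n! * (BETA1 / (2*pi*p))**n.
    """
    return math.factorial(n) * (BETA1 / (2.0 * math.pi * p)) ** n

uv = [pole_coefficient(n, -1) for n in range(12)]  # UV renormalon at u = -1
ir = [pole_coefficient(n, 2) for n in range(12)]   # IR renormalon at u = +2

for n in (1, 5, 10):
    print(n, uv[n], ir[n], abs(uv[n] / ir[n]))
```

Both contributions grow factorially, but the one from $u = -1$ alternates in sign and outgrows the one from $u = 2$ by a factor $2^n$. In the real series the residues differ from one, which decides at which order this asymptotic dominance actually sets in.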
The calculation of the perturbative contribution to FESR observables requires that one performs the integral of Eq. (7). A prescription for the renormalisation scale µ -which enters through the logarithms of Eq. (9) -must be adopted in the process. In the procedure known as Contour-Improved Perturbation Theory (CIPT) [13,14] a running scale µ 2 = Q 2 is adopted and the running of α s is resummed along the contour with the QCD beta function. With this procedure the perturbative contribution is cast as A strict fixed order prescription, known as Fixed Order Perturbation Theory (FOPT) corresponds to the choice of a fixed scale µ = s 0 . The coupling can then be taken outside the integrals which are now performed over the logarithms of Eq. (9) as The FOPT series can be written as an expansion in the coupling as where the coefficients now depend on the choice of weight function. The chosen prescription for the renormalisation group improvement of the series affects, in practice, the precise extraction of the strong coupling from hadronic τ decays. It remains, as of today, one of the main sources of theoretical uncertainty in these α s determinations [4][5][6][7]. The two prescriptions define two different asymptotic series with rather different behaviours. Inevitably, the analysis of the reliability of the two procedures requires knowledge about higher orders of the series. In particular, some of the arguments often put forward in favour of CIPT -in an attempt to leave aside the issue with the higher orders -mention a "radius of convergence" [2,29], a notion that contradicts the fact that the series are both asymptotic.
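The difference between the two prescriptions can be sketched in a toy setting. The assumptions are: one-loop running of the coupling only (with $\beta_1 = 9/2$), the four known coefficients $c_{n,1}$ quoted above, and the kinematic weight $w(x) = (1-x)^2(1+2x)$, for which $W(x) = (1-x)^3(1+x)$. This is not the full-QCD computation used in this work, and all function names are hypothetical.

```python
import cmath
import math

BETA1 = 4.5                      # assumed one-loop beta coefficient (n_f = 3)
C = [1.0, 1.640, 6.371, 49.08]   # c_{1,1} ... c_{4,1} quoted in the text

def W_kin(x):
    """W(x) = 2 int_x^1 dz (1-z)^2 (1+2z) = (1-x)^3 (1+x)."""
    return (1 - x) ** 3 * (1 + x)

def contour_average(f, steps=4000):
    """Simpson estimate of (1/2pi) int_0^{2pi} dphi f(phi),
    i.e. (1/(2 pi i)) oint dx/x on the circle x = e^{i phi}."""
    h = 2 * math.pi / steps
    total = f(0.0) + f(2 * math.pi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3 / (2 * math.pi)

def delta_cipt(a0, order):
    """CIPT-like sum: the one-loop coupling is resummed along the contour."""
    def integrand(phi):
        log = 1j * (phi - math.pi)               # ln(-x) for x = e^{i phi}
        a = a0 / (1 + 0.5 * BETA1 * a0 * log)    # one-loop running from s_0
        series = sum(C[n - 1] * a ** n for n in range(1, order + 1))
        return W_kin(cmath.exp(1j * phi)) * series
    return contour_average(integrand).real

def delta_fopt(a0, order):
    """FOPT-like sum: re-expand in a0 = a(s_0) and truncate at a0**order."""
    # I[k] = (1/2pi) int dphi W(e^{i phi}) ln^k(-x)
    I = [contour_average(lambda phi, k=k: W_kin(cmath.exp(1j * phi))
                         * (1j * (phi - math.pi)) ** k).real
         for k in range(order)]
    total = 0.0
    for n in range(1, order + 1):
        for k in range(order - n + 1):
            # a^n = a0^n * sum_k binom(-n, k) * (BETA1 * a0 * ln(-x) / 2)^k
            binom = (-1) ** k * math.comb(n + k - 1, k)
            total += C[n - 1] * binom * (0.5 * BETA1) ** k * a0 ** (n + k) * I[k]
    return total

a0 = 0.316 / math.pi  # a(m_tau^2) for the alpha_s value used in the text
for order in range(1, 5):
    print(order, delta_fopt(a0, order), delta_cipt(a0, order))
```

The two functions sum the same perturbative input and differ only in how the running along the contour is treated, so they agree up to terms of higher order in $a(s_0)$. At realistic values of the coupling the difference is visible order by order, which is the origin of the FOPT/CIPT discrepancy discussed above.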
Here we employ the estimate for the higher-order coefficients of the series obtained from a careful and systematic use of Padé approximants [27]. The results of Ref. [27] are model independent and corroborate to a large extent the results obtained in the context of renormalon models, in which the series is modelled by a small number of dominant renormalon singularities employing the available knowledge about their nature [12,23,30], as well as those obtained from conformal mappings that make use of the location of renormalon singularities [31]. We will also exploit scheme variations as a method to improve the convergence of the perturbative series and discuss their usefulness in realistic extractions of $\alpha_s$ from hadronic $\tau$ decays.

Renormalons in spectral function moments
Several moments of the spectral functions have been used in low-Q 2 α s determinations from hadronic τ data and e + e − → (hadrons) [4-6, 22, 28]. Since the FESR requires the weight function to be analytic it is customary to employ polynomials, which we denote in terms of their expansions in monomials as Of particular importance are the weight functions that are "pinched", i.e. weight functions that are zero at x = 1, and that have b Another important class of weight functions identified in [23] are those that contain the linear term in x. We will discuss these two classes of moments in detail below. The series for δ (0) w i inherits the divergence of the Adler function expansion and accordingly is also amenable to a treatment in terms of its Borel transform. However, the renormalon content of the Borel transformed δ (0) w i is different from the Adler function counterpart, as we discuss in the remainder of the section.

Results in large-β 0
We start by investigating the renormalons in $\delta^{(0)}_{w_i}$ in the large-$\beta_0$ limit of QCD [16]. These results are obtained by first considering a large number of fermion flavours, $N_f$, while keeping $N_f \alpha_s$ constant. The $q\bar q$ bubble corrections to the gluon propagator are of order one in this power counting and must be summed to all orders. This dressed gluon propagator is used to obtain all the leading-$N_f$ corrections, at every order in $\alpha_s$, to a given observable. In the end, $N_f$ is replaced by the leading $\beta$-function coefficient, effectively incorporating a set of non-abelian contributions [32]. Accordingly, the $\alpha_s$ evolution is performed at one loop.
In this limit, the Borel transformed Adler function is known to all orders in perturbation theory and it can be written in a compact form as [16,33,34] where C is a parameter which depends on the renormalisation scheme. For C = 0 we have MS. This result displays explicitly the renormalon poles. They are all double poles with the exception of the leading IR pole at u = 2, which is simple. The IR poles are particularly important in the subsequent discussion, and in particular their connection to OPE condensates. Each of the IR poles that appear at a given position u = p in the Borel transform of the Adler function can be mapped to the existence of contributions of dimension D = 2p in the OPE [16]. This explains, for example, the absence of a pole at u = 1 since there is no gauge-invariant D = 2 condensate in the OPE. This non-trivial connection between perturbative and non-perturbative physics will also be manifest in the Borel transform of δ (0) w i . Using this result, the Borel transform of δ (0) w i can be obtained from Eq. (7) employing the Borel integral representation of the Adler function, Eq. (12). One can then write where we performed the change of variables x = e iφ . In the large-β 0 limit, the β function is truncated at its first term 3 The exponential in Eq. (19) can be written as Inverting the order of integration and using Eq. (12) one can read off the Borel transform of δ The prefactor of Eq. (22) can be obtained analytically for polynomial weight functions. For the monomial w i = x n one finds 4 One immediately sees that the sin(πu) reduces an infinite number of UV and IR double poles in B[ D](u) to simple poles. In this sense, one can say that B[δ (0) w i ] is significantly less singular than the Adler function counterpart, a fact that has been exploited in Ref. [27].
The prefactor of Eq. (23) is also highly non-trivial. It cancels the zero at u = 1 + n in sin(πu), which means that the pole at u = 1 + n of B[ D](u) remains double (or single, in the case of u = 2). This is clearly not a coincidence and is related to the non-perturbative contributions to R τ,V /A . To expose this connection, consider the contribution of D ≥ 4 in the OPE expansion, Eq. (5), to R τ,V /A which can be cast as  For a monomial w i (x) = x n -and to the extent that the s dependence of the coefficients C D can be neglected, as discussed previously -this reduces to For positive integer values of n, the integral in the last equation is only non-vanishing for −n + D/2 = 1. Therefore, as is well known, under these assumptions, for w i = x n the only contribution comes from the condensates with D = 2(n + 1) which, in turn, is related to the pole in the Borel transform of the Adler function at u = n + 1. It becomes apparent that the prefactor of Eq. (23) is not accidental: the pole in B[ D](u) that corresponds to the condensate that contributes maximally to moments of w = x n is not cancelled by the prefactor of Eq. (23). For monomials x n with n ≥ 0 three cases can be distinguished: • If n = 0 all poles become simple poles, since there is no contribution from OPE condensates, apart from the α s -suppressed terms, under the assumptions of Eq. (25). In particular, the pole at u = 2 which was simple is exactly cancelled and the function is regular at u = 2.
• If $n = 1$, the dominant contribution from the OPE is the one from $D = 4$. The pole at $u = 2$ related to this OPE contribution is not cancelled and all other IR and UV poles become simple poles. This is a distinct situation because it is the only case where $B[\delta^{(0)}_{x^n}]$ is singular at $u = 2$; in all other cases the leading IR singularity is located at $u = 3$.
• Finally, if n ≥ 2, all IR poles for u > 2 become simple poles, with the exception of the pole at u = n + 1, which remains double and is now the only double pole in B[δ (0) x n ] -all others are reduced to simple poles by the zeros of sin(πu). In this case, the pole at u = 2 that corresponds to the contributions due to the gluon condensate is again exactly cancelled by sin(πu) and the function is analytic at u = 2.
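The pole counting in the three cases above can be checked numerically with a short sketch. It assumes that, with one-loop running, the contour integral over $x = e^{i\phi}$ multiplies the Borel transform of the Adler function by $\frac{1}{2\pi}\int_0^{2\pi} d\phi\, W(e^{i\phi})\, e^{-iu(\phi-\pi)}$, with $W(x) = 2(1-x^{n+1})/(n+1)$ for $w = x^n$. The closed form $2\sin(\pi u)/[\pi u\,(1+n-u)]$ used for comparison is an independent evaluation of that integral and is assumed, not confirmed, to coincide with the prefactor of Eq. (23). Function names are hypothetical.

```python
import cmath
import math

def prefactor_numeric(n, u, steps=20000):
    """Simpson estimate of (1/2pi) int_0^{2pi} dphi W_n(e^{i phi}) e^{-i u (phi - pi)},
    where W_n(x) = 2 (1 - x^(n+1)) / (n + 1) belongs to the monomial w = x^n."""
    h = 2 * math.pi / steps
    def f(phi):
        x = cmath.exp(1j * phi)
        weight = 2 * (1 - x ** (n + 1)) / (n + 1)
        return weight * cmath.exp(-1j * u * (phi - math.pi))
    total = f(0.0) + f(2 * math.pi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return (total * h / 3 / (2 * math.pi)).real

def prefactor_closed(n, u):
    """Candidate closed form, valid away from u = 0 and u = n + 1."""
    return 2 * math.sin(math.pi * u) / (math.pi * u * (1 + n - u))

# Generic u: the numerical integral and the closed form should agree.
for n in range(4):
    for u in (-0.6, 0.3, 1.7, 2.5):
        print(n, u, prefactor_numeric(n, u), prefactor_closed(n, u))

# Integer u: the prefactor vanishes except at u = n + 1 (and at u = 0).
for n in range(4):
    print(n, [round(prefactor_numeric(n, k), 10) for k in range(1, 6)])
```

At positive integer $u$ the prefactor vanishes, which lowers the order of the corresponding pole by one, except at $u = n+1$ where it tends to $2(-1)^n/(n+1)$. This reproduces the pattern listed above: for $n = 1$ the simple pole at $u = 2$ survives, while for $n \geq 2$ the only remaining double pole sits at $u = n + 1$.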
In Tab. 1, we show the residues of the dominant UV and IR poles for the first six monomials, which are the building blocks for most of the moments used in the literature. Residues of the double poles are shown as boxed numbers. The results of this table can be used to understand a few features of specific cases. For example, in the Borel transform of the kinematic moment, Eq. (17), a partial cancellation of the leading UV renormalon is manifest: its residue is reduced by a factor of 3.3. The perturbative series associated with this moment is therefore expected to be tamer, with the asymptotic behaviour setting in later.
We are now in a position to reassess some of the findings of Ref. [23] in the light of these results. One of the main observations of Ref. [23] is that the perturbative series for moments of weight functions that contain the monomial $x$ tend to be badly behaved, in the sense that the series never stabilises around the true value of the function; it displays what was called a "run-away behaviour". This can be directly linked to the fact that the Borel transforms of these moments are the only ones that have the singularity at $u = 2$. The contribution of this renormalon to the coefficients of the series has a fixed sign and is large at higher orders.
In order to establish the correspondence between the leading renormalons and the behaviour of the perturbative series, we will make use of an even simpler model. Since the series is dominated by the leading renormalons, we can construct an approximation to the result in large-β 0 using only the leading UV pole and the first two IR poles, which corresponds to truncating the sum in Eq. (18) at its first term. We know from the works of Ref. [12,23,27] that such a minimalistic model should be largely sufficient to capture the main features of the full result in large-β 0 . In Fig. 1 we confirm this expectation by plotting the results for the Adler function in large-β 0 and in its truncated version, normalized to the value of the Borel integral in each case, which removes an overall normalization effect that is immaterial here (throughout this paper we use α s (m 2 τ ) = 0.316(10) [35]). In Fig. 1, one sees that the results are essentially identical for our purposes. However, the simplicity of this model prevents the study of moments that are maximally sensitive to condensates with dimension D ≥ 8, because the corresponding renormalon poles are not included.
We start by considering the moment of w(x) = x. The perturbative expansion of δ (0) x in the truncated model for FOPT and CIPT are displayed in Fig. 2(a). The FOPT series shows the "run-away behaviour" identified in Ref. [23]. It overshoots the true value, at first, and later crosses it and runs into the asymptotic regime with almost no stable region. The CIPT series is better behaved but also overshoots the true value and then runs into the asymptotic behaviour, with sign alternating coefficients, much earlier than FOPT. To understand this pattern we can use the Borel transform of δ (0) x which is rather simple in the truncated model, The result exhibits the UV pole at u = −1, as well as the IR poles at u = 2 and u = 3. All poles are simple due to the zeros of the sin(πu) in the prefactor. We also note that the Borel transform has a regular part, which stems from the first term within square brackets (the would-be pole at zero is also canceled by the prefactor). In Fig. 2(c) we show the breakdown of the different contributions to the perturbative series in FOPT. The series is dominated by the regular contribution which initially overshoots the true value. At higher orders, the first IR and UV poles dictate the tendency and the series never stabilizes around the true value. The IR contribution is negative and is responsible for the run-away behaviour, with a superimposed sign alternation from the UV pole. We now turn to the pinched moments without the term in x. In Fig. 2(b), we show the results for w = 1 − x 2 . The Borel transform of δ 1−x 2 is regular at u = 2 and the leading UV pole is partially cancelled, as we can infer from the results of Tab. 1. This translates into a smoother series. Now the FOPT series nicely approaches the true value and remains stable around it for several orders until eventually entering the asymptotic regime, when the leading UV pole takes over. The result for CIPT, on the other hand, is less accurate (red dashed line in Fig. 2(b)). 
It approaches the Borel sum of the series only when the asymptotic behaviour has already set in.
Finally, we comment on the results for w(x) = 1. This moment lies somewhere in between the two extreme cases we discussed above. It also benefits from being regular at u = 2 but the partial cancellation of singularities that happens in pinched moments is not present. In this case, FOPT is able to approach the result although at the expense of overshooting it for the first four orders or so. We omit the plots in this case for the sake of brevity (the result in the context of Borel models can be found in Ref. [23]). One should finally remark that, in general, the perturbative series for δ (0) w i are better behaved than the Adler function series, shown in Fig. 1. This fact is a consequence of the Borel transform of δ (0) w i being significantly less singular than the Adler function counterpart. The sign alternation in the Adler function starts already at O(α 5 s ) and the perturbative series never stabilises around the Borel sum. This does not prevent, however, the pinched moments without the linear term from having a very good perturbative behaviour, as exemplified in Fig. 2(b).

Partial conclusions
We are in a position to draw a few conclusions from the study of spectral function moments in large-$\beta_0$ and its truncated form:
• The renormalon structure of $B[\delta^{(0)}_{w_i}]$ can be understood in terms of the contributions from the OPE condensates. The Borel transform has a pole at $u = 2$ if and only if the weight function contains a term proportional to $x$. The behaviour of the perturbative series associated with these moments is qualitatively different and the true value of the series is well approached neither by FOPT nor by CIPT, as already discussed in Ref. [23].
• The Borel transform of the moments from the monomials w(x) = x n , with n > 1, has only one double pole at u = n + 1, related to the OPE condensate with D = 2(n + 1) to which the moment is maximally sensitive, in the sense of Eq. (25).
• Moments that are pinched and do not contain the term proportional to $x$ are particularly stable. Their Borel transform is regular at $u = 2$ and there is a partial cancellation of the leading UV renormalon, which translates into a smoother series. The series, in these cases, is well described by FOPT while CIPT struggles to approach the Borel sum and runs into the asymptotic behaviour already at $\mathcal{O}(\alpha_s^4)$ or $\mathcal{O}(\alpha_s^5)$.
In the remainder we will discuss the case of QCD. With the use of a convenient scheme transformation and a redefinition of the Borel transform one is able to show that results in QCD are very similar to the ones obtained in large-$\beta_0$.

Results in QCD
Two main ingredients enter the discussion of the previous section. First, we have full knowledge about the renormalon structure of the Adler function. In particular, we know exactly which poles exist and whether they are double or simple poles. Second, in the derivation of Eq. (23), because we work in the large-$\beta_0$ limit, we made use of the one-loop running of $\alpha_s$. In the case of QCD, on the other hand, we have to be content with a partial knowledge about the renormalon structure of the Adler function. The positions of the singularities are unchanged, but they are no longer poles and become branch points with associated cuts. The running of the coupling is also much more involved when terms beyond one loop are included in the $\beta$ function. In order to be able to obtain an analytical expression for the Borel transform of $\delta^{(0)}_{w_i}$, it is useful to leave the $\overline{\rm MS}$ scheme and work in another class of schemes which have a particularly simple $\beta$ function.
Without loss of generality, we will employ the $C$ scheme introduced in Ref. [24] in the derivation we perform below. The implementation of the $C$ scheme is based on the fact that, when going from an input scheme, say the $\overline{\rm MS}$, to another scheme that we denote with hatted quantities, the QCD scale parameter $\Lambda$ changes as [36]
$$\hat\Lambda = \Lambda\, e^{c_1/\beta_1},$$
where the coefficient $c_1$ is the first non-trivial coefficient in the perturbative expansion of the coupling $\hat a \equiv \hat\alpha_s/\pi$ in terms of $a \equiv \alpha_s^{\overline{\rm MS}}/\pi$:
$$\hat a = a + c_1 a^2 + c_2 a^3 + \cdots.$$
With the expression of the scale-invariant QCD $\Lambda$ parameter one can then relate the two schemes with a continuous parameter $C$, that measures the shift in $\Lambda$, by
$$\frac{1}{\hat a_Q} + \frac{\beta_2}{\beta_1}\ln \hat a_Q = \frac{1}{a_Q} + \frac{\beta_1}{2}\,C + \frac{\beta_2}{\beta_1}\ln a_Q - \beta_1\int_0^{a_Q}\frac{da}{\tilde\beta(a)},$$
where
$$\frac{1}{\tilde\beta(a)} \equiv \frac{1}{\beta(a)} - \frac{1}{\beta_1 a^2} + \frac{\beta_2}{\beta_1^2\, a},$$
and we have made explicit the renormalisation scale dependence in $a_Q$. A relation that is important in the remainder is the analogue of Eq. (20) in the $C$ scheme which reads In this scheme, the $\beta$-function is known exactly and reads
$$-Q\frac{d\hat a_Q}{dQ} \equiv \hat\beta(\hat a_Q) = \frac{\beta_1\,\hat a_Q^2}{1 - \frac{\beta_2}{\beta_1}\hat a_Q}. \qquad (32)$$
This fact enormously simplifies the task of obtaining a closed form for the Borel transform of $\delta^{(0)}_{w_i}$ in QCD. We also remark that the dependence on the scheme parameter $C$ is, in fact, governed by the same function
$$-2\frac{d\hat a_Q}{dC} = \frac{\beta_1\,\hat a_Q^2}{1 - \frac{\beta_2}{\beta_1}\hat a_Q}. \qquad (33)$$
The coupling becomes smaller for larger values of $C$ and the theory ceases to be perturbative for $C \approx -1.5$ (using the $\overline{\rm MS}$ scheme as input) [24]. Since the scale and the scheme dependence are governed by the same function, the coupling in the $C$-scheme depends on a particular combination of the scale and scheme parameters, $\hat\alpha_s \equiv \hat\alpha_s(Q^2 e^{C})$. Scale and scheme variations become, therefore, completely equivalent. The explicit expressions for the perturbative coefficients relating the $\overline{\rm MS}$ and the $C$ schemes, together with further details, can be found in the original publications [24,37]. Finally, we remark that there is no value of $C$ that corresponds strictly to the $\overline{\rm MS}$, but for $C \approx 0$ the results are very similar (at one loop, $C = 0$ corresponds to the $\overline{\rm MS}$ exactly).

For schemes in which the $\beta$-function takes the form of Eq. (32) it is convenient to work with a modified Borel transform defined as [25] where $\hat\lambda = \beta_2/(\beta_1\pi)$ and $\hat c_{n,1}$ are the Adler function coefficients in the $C$ scheme. With this definition, the Borel sum of the series now reads The asymptotic expansion to the latter result is obtained using Eq. (34) in (35) and gives, as expected [25], The modified Borel transform has renormalon singularities at the same locations as the usual Borel transform, but their exponent is shifted. As demonstrated in Ref. [25], if the usual Borel transform has a singularity of the form the modified Borel transform behaves for $u \sim p$ as with the exponent of the singularity shifted by $2\pi p\hat\lambda/\beta_1 = +2p(\beta_2/\beta_1^2)$ and, as before, $u = \beta_1 t/(2\pi)$.

Let us now calculate the modified Borel transform of $\delta^{(0)}_{w_i}$ in the $C$ scheme. The calculation is very similar to what was done in large-$\beta_0$. Using Eq. (35) in Eq. (7) we obtain With the use of Eq. (31) one finds and inverting the order of the integration in Eq. (39) one obtains, for the monomial weight function $w(x) = x^n$, the following result This shows that the relation of Eq. (23) is, in fact, much more general, since any scheme can be brought to the $C$ scheme without loss of generality. The prefactor is the same in QCD and in large-$\beta_0$, and so is the enhancement of the renormalon associated with the contribution with dimension $D = 2(n + 1)$ in the OPE.$^5$ The main difference is that in QCD the singularities of $B[\hat D](u)$ are, in general, branch points and are no longer poles. The exponent of the singularities is related to the anomalous dimension of the associated operator contributing to the OPE.
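The interplay between the prefactor and the renormalon singularities of the Adler function can be checked numerically. The sketch below assumes that, for $w(x) = x^n$, the prefactor multiplying the Borel transformed Adler function has the form $\frac{2}{1+n-u}\,\frac{\sin(\pi u)}{\pi u}$, which is the form we expect from the contour integral of a monomial. This explicit expression and the function names `sinc` and `prefactor` are assumptions made for the illustration, not a transcription of Eq. (41).

```python
import math

def sinc(y):
    """Normalised sinc, sin(pi*y)/(pi*y), with the removable point y = 0 filled in."""
    if y == 0.0:
        return 1.0
    return math.sin(math.pi * y) / (math.pi * y)

def prefactor(n, u):
    """Assumed prefactor 2/(1+n-u) * sin(pi*u)/(pi*u) for w(x) = x^n (u != 0).

    Uses sin(pi*u) = (-1)^n * sin(pi*(n+1-u)), so that the point u = n+1,
    where the zero of the sine meets the explicit pole, is evaluated safely.
    """
    return 2.0 * (-1) ** n * sinc(n + 1 - u) / u

# Scan the first IR renormalon positions for a few monomials.
for n in (1, 2, 3):
    values = {u: round(prefactor(n, float(u)), 12) for u in (2, 3, 4)}
    print(f"w(x) = x^{n}:", values)
```

Under this assumption the prefactor vanishes (up to rounding) at every positive integer $u$ except $u = n + 1$, where it stays finite. A simple pole of the Adler function at $u = 2$ is therefore removed unless $n = 1$, while a higher-order singularity at $u = n + 1$ is left untouched.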
To make further progress, let us look at the explicit structure of the IR singularities. In the notation of [12], the singularities of the usual Borel transform are written as where the constants $\tilde\gamma$ and $\tilde b^{(p)}_i$ depend on the anomalous dimension of the associated operator in the OPE as well as on $\beta$-function coefficients. The explicit expression for the exponent $\tilde\gamma$ is
$$\tilde\gamma = 2p\,\frac{\beta_2}{\beta_1^2} - \frac{\gamma^{(1)}_{O_d}}{\beta_1}, \qquad (43)$$
where the anomalous dimension associated with the operator $O_d$ is defined as For the modified Borel transform we have then where the first term in the r.h.s. of Eq. (43) is exactly cancelled by the shift in the singularity of Eq. (38). For the discussion of the Borel transformed $\delta^{(0)}_{w_i}$ it is crucial to inspect the leading IR renormalon. This renormalon is related to the gluon condensate, which can be expressed in terms of the scale-invariant combination $\langle aG^2\rangle$ [38]. In this case, the leading IR singularity of the Adler function in the $C$-scheme, and using the modified Borel transform, reduces simply to which is a simple pole, exactly as in the large-$\beta_0$ limit. This is remarkable because, with Eq. (41), one can directly translate many of our conclusions from large-$\beta_0$ to QCD. In particular, $B[\delta^{(0)}_{w_i}]$ has a pole at $u = 2$ if and only if the weight function contains a term proportional to $x$. The conclusion that $B[\delta^{(0)}_{w_i}]$ is less singular than the Borel transformed Adler function also remains valid, and it is again true that the singularities associated with the contributions in the OPE to which the moment is maximally sensitive are not altered by the prefactor of Eq. (41).

5 An approximate relation between the Borel transformed Adler function and the Borel transform of $\delta^{(0)}_{w_\tau}$ can be found in [31]. The result of Eq. (41) is fully general.
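For orientation, the size of the exponent shift quoted after Eq. (38), $2p\,\beta_2/\beta_1^2$, can be evaluated for the leading IR renormalon at $p = 2$. The short sketch below uses the $N_f = 3$ values $\beta_1 = 9/2$ and $\beta_2 = 8$ given in the appendix; the function name `exponent_shift` is hypothetical.

```python
import math
from fractions import Fraction

BETA1 = Fraction(9, 2)  # N_f = 3 value quoted in the text
BETA2 = Fraction(8)     # N_f = 3 value quoted in the text

def exponent_shift(p):
    """Shift 2*p*beta2/beta1^2 of the singularity exponent at u = p."""
    return 2 * p * BETA2 / BETA1 ** 2

shift = exponent_shift(2)
print("shift at p = 2:", shift, "=", float(shift))

# Cross-check against the equivalent form 2*pi*p*lambda_hat/beta1,
# with lambda_hat = beta2/(beta1*pi).
lam_hat = float(BETA2 / BETA1) / math.pi
print(math.isclose(2 * math.pi * 2 * lam_hat / float(BETA1), float(shift)))
```

This is the non-integer part of the exponent that is compensated when the modified Borel transform is used, which is why the singularity tied to the scale-invariant gluon condensate can reduce to a simple pole.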
In view of the above discussion, and the results of Eqs. (41) and (46), we learn that the same mechanisms of suppression, enhancement, and cancellation of renormalon singularities identified in large-$\beta_0$ are also at work in QCD. Similarities between the results in the two cases were identified in Ref. [23], although the explicit connection with the renormalon singularities of $B[\delta^{(0)}_{w_i}]$ was not investigated in that work. Although in QCD we have only partial information about the renormalon singularities, and in particular about their numerator, we are in a position to speculate that the reason behind the good or bad perturbative behaviour of the different moments is rooted in the same interplay between the renormalons of $B[\delta^{(0)}_{w_i}]$.

Here, we study the QCD perturbative series for different moments using the model-independent reconstruction of the higher-order coefficients of Ref. [27], where the mathematical method of Padé approximants was used to describe the series. In Fig. 3, we show results for four emblematic moments, in FOPT and CIPT, with $s_0 = m_\tau^2$ and in the $\overline{\rm MS}$; the shaded bands represent an uncertainty that stems from the Padé approximant method, as discussed in [27]. In Figs. 3(a) and 3(b), the results for two moments that display good perturbative behaviour are shown: $w_\tau$ and $w(x) = 1 - x^2$, respectively. The horizontal yellow bands represent an estimate for the Borel integral of the moments within the Padé-approximant description. Again, the FOPT series approaches the true value, as predicted by the Padé approximants, and is rather stable around it until at least the eighth order. (We relegate to App. B a more detailed discussion about the Borel integrals together with a comparison with results from the model of Refs. [12,23], which are similar to ours.) In Fig. 3(c), we show the results for the monomial $w(x) = x$, which exacerbates the run-away behaviour that stems from the leading IR singularity, as in the large-$\beta_0$ case of Fig. 2(a). Here CIPT is in relatively good agreement with the true result, but this is not the case for other moments containing the linear term, such as $w(x) = 1 - x$, shown in Fig. 3(d). This moment inherits the run-away behaviour of the monomial and both FOPT and CIPT are rather unstable, never stabilizing around the true result.
Finally, one can corroborate our conclusion that the bad perturbative behaviour of moments containing the linear term is related to the leading IR singularity by considering the "alternative model" of Ref. [23]. In this case, a model for the QCD Adler function is constructed without the leading IR singularity. In the $C$-scheme and using the modified Borel transform, this means that, for this model, the Borel transform of $\delta^{(0)}_{w_i}$ is regular at $u = 2$, since in the prefactor of Eq. (41) the pole is cancelled by the zero in $\sin(\pi u)$. As shown in [23], the run-away behaviour is not present in this case, which shows, once more, that it stems from the leading IR singularity in $B[\delta^{(0)}_{w_i}]$.

Figure 3: Perturbative series for four emblematic moments order by order in $\alpha_s$ within the higher-order reconstruction of the QCD perturbative series with the use of Padé approximants of Ref. [27], in $\overline{\rm MS}$ and with $s_0 = m_\tau^2$. The shaded bands represent uncertainties associated with the Padé approximants as discussed in the original reference [27]. In (a) and (b) we show moments with good perturbative behaviour, while in (c) and (d) we display results for moments containing the term $x$, which show the run-away behaviour of the perturbative series.

In conclusion, with the use of the $C$-scheme and the modified Borel transform, the relation between the Borel transformed Adler function and the Borel transform of $\delta^{(0)}_{w_i}$ is formally the same in QCD and in the large-$\beta_0$ limit. The singularities related to the contributions in the OPE are equally enhanced or suppressed and, in general, $\delta^{(0)}_{w_i}$ is significantly less singular than the Adler function. In the case of the leading IR renormalon the parallel is strict, since within this framework it is a simple pole both in QCD and in large-$\beta_0$. The phenomenological consequences are then the same: moments with a linear term in $x$ display an unstable perturbative behaviour. Finally, the moments with good perturbative behaviour in large-$\beta_0$ are also well behaved in QCD, at least if FOPT is used. The results from the model-independent Padé approximant reconstruction of the series are qualitatively similar to the "Borel model" of Refs. [12,23], which attests to the robustness of our conclusions.

Optimal truncation with scheme variations
We close this work with a discussion of the optimal truncation of the (asymptotic) series associated with the moments that display good perturbative behaviour. In Ref. [24,39] it has been suggested that, in the spirit of an asymptotic series, the optimal truncation for the perturbative expansion of the Adler function and of integrated moments is achieved by choosing the scheme (or scale) in which the last known coefficient of the series vanishes. In this case, by construction, the smallest term of the series, which is zero, is precisely the last known term, which makes it the ideal point for the optimal truncation of the asymptotic series. 6 Through this procedure, one expects to make maximum use of the available information from perturbative QCD. Here we show that this type of optimization works very well in FOPT for all the moments with good perturbative behaviour, within the reconstruction of the series provided by the Padé approximants of Ref. [27].
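The notion of truncating an asymptotic series at its smallest term can be illustrated with a textbook toy example. The sketch below is not the QCD series: it assumes the alternating factorial series $\sum_n (-1)^n n!\, a^{n+1}$, whose Borel sum is $\int_0^\infty dt\, e^{-t/a}/(1+t)$, and a hypothetical expansion parameter $a = 0.12$. The function names are ours.

```python
import math

def exact(a, s_max=60.0, steps=20000):
    """Borel sum F(a) = a * int_0^inf e^{-s}/(1 + a*s) ds via Simpson's rule.

    The integral is cut at s_max, where the integrand is negligible.
    """
    h = s_max / steps
    g = lambda s: math.exp(-s) / (1.0 + a * s)
    total = g(0.0) + g(s_max)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(k * h)
    return a * total * h / 3.0

def terms(a, n_max):
    """Terms (-1)^n n! a^(n+1) of the toy asymptotic series."""
    return [(-1) ** n * math.factorial(n) * a ** (n + 1) for n in range(n_max + 1)]

a = 0.12                      # hypothetical expansion parameter
t = terms(a, 20)
n_opt = min(range(len(t)), key=lambda n: abs(t[n]))  # index of the smallest term
partial = sum(t[: n_opt + 1])                         # series truncated there
error = abs(partial - exact(a))
print("optimal order:", n_opt)
print("smallest term:", abs(t[n_opt]), " truncation error:", error)
```

In this toy case the terms decrease until the order reaches roughly $1/a$ and grow afterwards, and the error of the series truncated at the smallest term is of the size of that term. A scheme choice that makes the last known coefficient vanish places the available order at such a minimum by construction.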
We will work in the C-scheme and perform variations of the continuous scheme parameter C. However, as discussed in Sec. 3.2, in this scheme, scale and scheme transformations are essentially equivalent and the same results can be achieved by renormalisation scale variations.
Let us illustrate the procedure with the help of a concrete case. The FOPT expansion for w(x) = 1 − x 2 and s 0 = m 2 τ in the C-scheme is given by where we show the four exactly known contributions (up to and including α 4 s ) plus the first unknown contribution, proportional to c 5,1 . At order α 5 s , the terms without c 5,1 depend only on β-function coefficients and lower c n,1 and are known exactly. It is customary to include the fifth term in realistic α s analysis through an estimate of c 5,1 which here is taken to be c 5,1 = 277 ± 51 [27] -but we will see that the results do not depend strongly on the value of c 5,1 .
The optimized truncation is then obtained by finding the value(s) of $C$ for which the coefficient of $\hat a_Q^5$ vanishes. In the case of $1 - x^2$, for FOPT with our central value for $c_{5,1}$, one finds two such values: $C_1 = -1.463$ and $C_2 = -0.4763$. The former leads to a rather unstable series, which we discard, since the coupling is already entering the non-perturbative regime [$\hat\alpha_s(C_1, m_\tau^2) = 0.531$], while the latter is still in the perturbative regime [$\hat\alpha_s(C_2, m_\tau^2) = 0.355$] and gives rise to the optimized result. In Fig. 4(a) we compare the optimized series for $\delta^{(0)}_{1-x^2}$ (green dot-dashed line) with the usual $\overline{\rm MS}$ result (solid blue line) using the higher-order coefficients and the Borel sum from the description of Ref. [27]. One sees that the optimized FOPT series approaches the true value faster than the $\overline{\rm MS}$ result, already at $\mathcal{O}(\alpha_s^3)$, and remains rather stable around it. This optimization is related to the larger value of $\hat\alpha_s$, which leads to a series that "converges" faster than the $\overline{\rm MS}$ one.$^7$ With the optimized series, an estimate of the true result is obtained with the truncation at $\mathcal{O}(\alpha_s^5)$ which gives where the error is due to the variation of $c_{5,1}$ within one sigma. It is clear from Fig. 4(a) that this leads to an excellent agreement with the true result (as predicted from the results of [27]), which reads $0.2364 \pm 0.0020$. One should also remark that the procedure is rather independent of the value of $c_{5,1}$ that is used. An uncertainty due to the value of $\alpha_s$, for example, would be about one order of magnitude larger than the uncertainty shown in Eq. (48). An attempt to apply the same procedure to the CIPT series does not lead to any significant improvement with respect to the (already bad) result obtained in the $\overline{\rm MS}$, as shown by the red and purple lines in Fig. 4. The optimization can also be applied to the kinematic moment, $w_\tau$. The result is again very good and the acceleration of the series is even more obvious, as displayed in Fig. 4(b).
For illustration, we also show, in grey, the series in a scheme with a larger value of $C$, namely $C = 0.8$, for which $\hat\alpha_s(C = 0.8, m_\tau^2) = 0.2554$. One sees that in a scheme with a very small value of the coupling the convergence is smooth but very slow for practical purposes, where only the first few terms are available. Similar results can be obtained for the other moments that have a good perturbative behaviour. As an example, in Fig. 4(c) we show the result of the optimization of one of the pinched moments introduced in Ref. [6].
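The values of the coupling quoted for the different choices of $C$ can be cross-checked against each other. The sketch below assumes the $C$-scheme $\beta$ function of Ref. [24] in the form $-2\,d\hat a/dC = \beta_1\hat a^2/(1 - \frac{\beta_2}{\beta_1}\hat a)$, which integrates to $1/\hat a + (\beta_2/\beta_1)\ln\hat a = {\rm const} + \beta_1 C/2$, together with the $N_f = 3$ values $\beta_1 = 9/2$ and $\beta_2 = 8$. The helper names are hypothetical, and the reference point $\hat\alpha_s(C = 0.8, m_\tau^2) = 0.2554$ is taken from the text.

```python
import math

BETA1 = 4.5  # N_f = 3
BETA2 = 8.0  # N_f = 3

def invariant(a_hat):
    """Combination 1/a + (beta2/beta1) ln a, which is linear in C."""
    return 1.0 / a_hat + (BETA2 / BETA1) * math.log(a_hat)

def alpha_hat_at(alpha_ref, c_ref, c_new):
    """Evolve alpha_hat_s from scheme parameter c_ref to c_new at fixed scale."""
    target = invariant(alpha_ref / math.pi) + 0.5 * BETA1 * (c_new - c_ref)
    # invariant() decreases monotonically for 0 < a_hat < beta1/beta2,
    # so a plain bisection on that interval is enough.
    lo, hi = 1e-6, BETA1 / BETA2
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if invariant(mid) > target:
            lo = mid
        else:
            hi = mid
    return math.pi * 0.5 * (lo + hi)

for c in (-0.4763, -1.463):
    print(f"C = {c}: alpha_hat_s = {alpha_hat_at(0.2554, 0.8, c):.4f}")
```

Starting from the $C = 0.8$ value, the results should land close to the quoted 0.355 and 0.531, up to the rounding of the inputs. Because the same function governs the scale dependence, the identical routine describes a change of scale with $C$ replaced by $\ln(Q^2/m_\tau^2)$.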
It is also interesting to analyse a borderline case, namely that of $w(x) = 1$. This is not a moment with a bad perturbative behaviour (it does not have a linear term in $x$), but it is also not among the most stable perturbative series, since it does not benefit from the partial cancellation of renormalons. The result in this case is shown in Fig. 4(d).
Here the MS series overshoots the true value up to O(α 4 s ), as shown in Fig. 4(d). In this case, the value of C that optimizes the truncation turns out positive and the optimal scheme has a smaller value of α s than in MS. The optimization is achieved by avoiding the overshooting of the true result that is prominent in the MS series. The final result is more stable than that in the MS and one could expect a smaller error from the truncation of the series, but the acceleration is not very significant.
Finally, moments with bad perturbative behaviour do not improve in any significant way when we apply the optimization described here. A more stable perturbative expansion for these moments can be achieved with the method of conformal mappings, making use of the information about the location of the renormalon singularities [40][41][42][43]. Even with this technique, in some cases, the series approaches the true value only at high orders.

Conclusions
In this work, we have discussed in detail the perturbative behaviour of integrated spectral function moments and the connection with the renormalon singularities of their Borel transformed series, denoted B[δ (0) w i ]. The understanding of the perturbative expansion of such moments is important in guiding the choice of moments employed in realistic α s determinations from low-Q 2 FESRs. Moments with tamed perturbative expansions are more reliable and lead to smaller uncertainties from the truncation of perturbation theory.
In large-β 0 , one can easily establish the relation between the renormalons of the Adler function and those of the integrated moments in the MS scheme. An infinite number of renormalon poles of the Adler function is cancelled and B[δ (0) w i ] is significantly less singular. In particular, for polynomial moments, the leading IR pole is exactly cancelled unless the weight function contains a term proportional to x. The weight functions with this term are therefore the only ones that are singular at u = 2 and they display an unstable perturbative behaviour that stems from the contribution of this IR pole to the perturbative series. For the pinched moments that had been identified as having a good perturbative behaviour in Ref. [23], we found additional cancellations of renormalon singularities, which are related to a better behaviour at higher orders and postpone the asymptotic regime of the series.
Using the C scheme and a modified Borel transform we have been able to show, in Eq. (41), that the relation between Borel transformed moments and the Borel transformed Adler function is the same in QCD and in large-β 0 . In Eq. (46), we have also shown that the leading IR singularity in this framework is again a simple pole. These are the main results of this paper since they allow us to conclude that the same mechanisms of enhancement, suppression, and partial cancellation of renormalon singularities responsible for the behaviour of the perturbative moments in large-β 0 are operative in QCD as well. The similar behaviour of the integrated spectral function moments in the two cases is therefore no surprise and again the pinched moments without the linear term are the best ones (as pointed out in Ref. [23]). The instabilities related to the leading IR pole are also present in QCD.
Finally, we have shown that it is possible to use renormalisation scheme (or scale) variations to accelerate the convergence of the moments that display good perturbative behaviour. This had been suggested in Ref. [24] for the R τ ratio but it had never been investigated systematically before.
In conclusion, we have been able to understand the instabilities and stabilities of the perturbative expansions of integrated spectral function moments in terms of their renormalons. Apart from the implications for the choice of moments in precise $\alpha_s$ analyses, our results can be used in the context of Borel models for the Adler function, since we have shown that scheme transformations and the modified Borel transform can be used to simplify the structure of the leading IR singularity, related to the gluon condensate, which becomes a simple pole. The results in large-$\beta_0$ and QCD are therefore much more similar than previously thought. Our findings also suggest that alternative expansions that suppress some of the renormalons may lead to much more stable results, and we plan to investigate this issue further in the near future.
where the first five coefficients are known analytically [44,45]. It is important to highlight that $\beta_1$ and $\beta_2$ are scheme independent and, in our conventions, they are given by
$$\beta_1 = \frac{11}{2} - \frac{N_f}{3}, \qquad \beta_2 = \frac{51}{4} - \frac{19}{12}N_f,$$
with $N_f$ being the number of flavours. In the particular case of $N_f = 3$, relevant here, we have $\beta_1 = 9/2$, $\beta_2 = 8$.

B Details on the Borel integrals from Padé approximants
In this appendix we discuss in further detail how the Borel integrals, or "true values", of the perturbative series are obtained. We also compare our results with those of Ref. [23].
Our results are based on the reconstruction of the higher-order coefficients performed in Ref. [27], using Padé approximants. Several methods have been studied in [27], using different variants of rational approximants, and constructing the approximants to the Borel transformed Adler function, to the Borel transformed $\delta_{w_\tau}$. With these coefficients, given in Tab. 6 of Ref. [27], one can obtain the expansion of any moment, in FOPT or CIPT, rather accurately up to order $\mathcal{O}(\alpha_s^{10})$. The main advantage of the use of Padé approximants is that the method is almost completely model independent. In this framework, however, no unique representation of the Borel transformed Adler function is obtained, which makes the task of calculating the Borel integrals for each moment less straightforward. In order to estimate the Borel integrals we have constructed new Padé approximants, following the same methods of Ref. [27], to each of the Borel transformed $\delta^{(0)}_{w_i}$. In all cases where the moments have good perturbative behaviour, the approximants converge very fast: only three coefficients suffice to obtain a rather stable result. This means that the predictions from these Padé approximants are based only on the exactly known QCD results. From the Borel transform described by the Padés one can then easily calculate the Borel integral. Of course, more than one Padé can be built from the same input and we have constructed many different approximants, belonging to different sequences and also using Dlog Padés [27], in order to estimate the horizontal error band shown in Figs. 3 and 4. The moments containing the linear term $x$, however, lead to less stable results. In order to obtain a stable description of the Borel transform it is necessary to use more coefficients in the construction of the Padé approximants, which makes these results less model independent since they require input from the higher-order coefficients predicted in [27].
The final uncertainties in the horizontal (yellow) bands take into account the dispersion of the results from the use of different Padé approximants as well as the original uncertainty in the prediction of the coefficients of Ref. [27]. In most cases, the former dominates.
Finally, it is interesting to compare our Borel integrals with the ones from the description of Refs. [12,23]. In these works, the Borel transformed Adler function is modelled with its first three dominant renormalons, the leading UV and the first two IR singularities. The residues of the singularities are fixed so as to reproduce the known QCD results. The main advantage of this procedure is that one obtains a unique description of the Borel Adler function, from which all the results are derived. The disadvantage is a possible residual model dependence which could lead to unaccounted systematics. The results from the Padés are, however, in very good agreement with the "reference model" of [23], although the uncertainties in the latter case, stemming only from the imaginary ambiguities in the Borel integral, are usually smaller. In Fig. 5, we compare the two approaches to the Borel integral, for two exemplary moments, and show that they lead to very similar results. The Borel integral from the reference model of [12] is shown as a green band, with a horizontal offset with respect to the results from Padé approximants, in yellow. In both cases, the FOPT series is preferred.

Figure 5: FOPT and CIPT series as a function of the perturbative order, compared with the Borel integral from Padés [27] and from the renormalon model [23]. (a) $\delta^{(0)}$, $w(x) = (1 - x)^2(1 + 2x)$, PAs. (b) $\delta^{(0)}$, $w(x) = 1 - x$, PAs.