ODU Digital Commons ODU Digital Commons Combining Nonperturbative Transverse Momentum Dependence Combining Nonperturbative Transverse Momentum Dependence With TMD Evolution With TMD Evolution

Central to understanding the nonperturbative, intrinsic partonic nature of hadron structure are the concepts of transverse momentum dependent (TMD) parton distribution and fragmentation functions. A TMD factorization approach to the phenomenology of semi-inclusive processes that includes evolution, higher orders, and matching to larger transverse momentum is ultimately necessary for reliably connecting with phenomenologically extracted nonperturbative structures, especially when widely different scales are involved. In this paper, we will address some of the difficulties that arise when phenomenological techniques that were originally designed for very high energy applications are extended to studies of hadron structures, and we will solidify the connection between standard high energy TMD implementations and the more intuitive, parton model based approaches to phenomenology that emphasize nonperturbative hadron structure. In the process, we will elaborate on differences between forward and backward TMD evolution, which in the context of this paper, we call “ bottom-up ” and “ top-down ” approaches, and we will explain the advantages of a bottom-up strategy. We will also emphasize and clarify the role of the integral relations that connect TMD and collinear correlation functions. We will show explicitly how they constrain the nonperturbative “ g -functions ” of standard Collins-Soper-Sterman implementations of TMD factorization. This paper is especially targeted toward phenomenologists and model builders who are interested in merging specific nonperturbative models and calculations (including lattice QCD) with TMD factorization at large Q . Our main result is a recipe for incorporating nonperturbative models into TMD factorization and for constraining their parameters in a way that matches perturbative QCD and evolution. DOI: 10.1103/PhysRevD.106.034002


I. INTRODUCTION
The techniques of transverse momentum dependent (TMD) factorization and evolution that are rooted in the Collins-Soper-Sterman (CSS) [1][2][3] approach are by now very widespread, but applications tend to take on superficially different forms depending on their use and the specific phenomenological context. This fact will be very important for understanding the goals of the present paper. To unify different TMD applications, and especially to compare the phenomenologically extracted nonperturbative transverse momentum dependence across many different processes, some translation is still necessary. For our purposes, we will need to consider at least the following two phenomenological settings where TMD factorization has often been applied in the past: (1) Very large Q (Type II): With treatments of the transverse momentum spectra of processes at very large hard scales (say Q ≳ 10 GeV), it is common to assume (sometimes implicitly) that transverse momentum dependence is describable mostly as a form of perturbative radiation in collinear factorization. An example is weak boson production in hadronhadron collisions, where the hard scale is fixed by a heavy boson mass, larger than 80 GeV. When the transverse momentum q T of the boson is of order the hard scale, calculations can follow a fixed order collinear factorization treatment. The challenge is then to describe the behavior as q T decreases relative to Q. Term by term in fixed order calculations there are logarithms such as with integer m, n > 0, that grow until they spoil truncated perturbation theory. With this as the starting point, the natural strategy is to try to resum as many such logarithms as possible. Most traditional TMD factorization techniques [1][2][3], along with soft-collinear effective theory (SCET)-based approaches [4][5][6], as well as approaches that directly resum transverse momentum logarithms [7][8][9][10][11], effectively account for these types of logarithms while also allowing for at least some contribution from nonperturbative transverse momentum in the q T ≈ Λ QCD region. (2) Hadron structure and moderate Q (Type I): TMD parton distribution functions (pdfs) and fragmentation functions (ffs) also feature prominently in studies whose focus is more directly on the nonperturbative structure of hadrons. In these types of applications, the relevant hard scales tend to be at much lower Q than in Type II situations, such as the Q≈ few GeVs common in many semi-inclusive deep inelastic scattering (SIDIS) measurements. It is possible to trace the origin of many of the hadron-structure oriented approaches in Type I applications to intuitive pictures of colliding hadrons in a parton model. The hadrons, in this view, are composed entirely of nonperturbative quark and gluon constituents [12][13][14][15], and the earliest versions of phenomenological Type I applications usually adopted the approximation that all transverse momentum dependence is nonperturbative in origin. As such, they could mostly ignore the role of perturbative tails at large q T and of evolution [16][17][18][19]. The Type I and Type II classification roughly follows that at Feynman, Field, and Fox ( [20] Fig. 6). At a formal level, it is now very well understood that approaches to Type I and Type II observables can be made equivalent. And, of course, there is no sharp distinction between what constitutes a Type I or a type II scenario. It is possible to merge treatments of hadron structure with the evolution formulas that were traditionally applied at much larger Q [3,[21][22][23][24], and indeed much activity over the past decade was devoted to implementing TMD evolution, in the context of hadron structure studies, in ways that include nonperturbative parts [25][26][27][28][29][30][31][32][33][34][35][36][37][38][39][40]. (The versions of TMD factorization and evolution that we will focus on in this paper are those rooted in, or very similar to, the CSS formalism as described in Ref. [3]-see also Ref. [41] for translations to other approaches.) There are, nevertheless, some remaining open issues related to the interpretation of intrinsic nonperturbative transverse momentum dependence that is extracted pheomenologically and its role in cross section calculations, and these can have practical consequences. We will explain what, we mean here in much more detail in the main body of this paper. For now, we will prepare the reader by noting how some of the remaining complications originate in a clash between the natural phenomenological strategies that are often implicit in Type I and Type II situations.
Starting from a typical Type II perspective, the main issue is that large transverse momentum perturbation theory calculations (with very large Q) receive correction terms such as Eq. (1) that diverge as q T approaches zero. So, the natural strategy is to try to resum as many transverse momentum dependent logarithms as possible as q T decreases until one is essentially forced to incorporate a nonperturbative transverse momentum dependent component. We call this a "top-down" view because it starts by optimizing a large q T dependence at large q T ≈ Q in collinear perturbation theory, and then it extends it via evolution and resummation downward to more moderate Q and q T ≈ 0.
But, as an alternative strategy, one might instead start from the perspective more common in Type I scenarios. That is, one may begin by considering a moderate input scale Q 0 , low enough so that the accessible range of q T either is comparable to Q 0 and is perturbative in origin or is smaller than Q 0 and is mostly nonperturbative. 1 A TMD parton model type of description is the most natural and appropriate approach here. There is no region within the range of 0 < q T ≲ Q 0 where calculations involve large logarithms analogous to Eq. (1). Thus, for Q ≈ Q 0 , the need to resum them does not arise. The only task then is to match a nonperturbative parametrization of transverse momentum dependence to a fixed order q T ≈ Q 0 calculation of transverse momentum dependence in collinear factorization.
However, the diverging logarithms will reappear later if we evolve to large Q. In this case, the uncontrolled perturbation theory errors will reappear at large q T once, we consider Q ≫ Q 0 where the transverse momentum region q T ≫ Q 0 becomes accessible. The problematic logarithms are in the correlation functions that were originally defined at the input scale, and they take the form More specifically, there exists no significant transverse momentum region where m=q T and q T =Q 0 are both simultaneously small, though each is small in some region of q T . Note that it is possible for this to be the situation even if Q 0 is large enough that α s ðQ 0 Þ is small. Unlike with the Type II oriented approaches discussed previously, the large logarithms from the bottom-up perspective are due to q T getting large rather than small. We call it a "bottom-up" viewpoint because it starts by optimizing q T ∼ 0 treatments at moderate Q 0 , and then uses evolution equations to extend to large q T and large Q ≫ Q 0 where terms such as Eq. (2) need to be addressed. These aspects of a bottom-up style of approach will be made much clearer in our review of TMD factorization in Sec. II.
In a more general sense, within typical Type II oriented approaches, transverse momentum dependence is seen as primarily perturbative and requiring corrections from nonperturbative behavior, while in typical Type I oriented approaches it is seen as a primarily nonperturbative phenomenon that needs to be improved with perturbative corrections from large transverse momentum. In the former, divergences are seen as a low q T problem while in the latter they are a large q T problem.
There is no formal difference between the bottom-up and top-down viewpoints just discussed, and they are generally described by identical sets of factorization and evolution equations. The logarithms such as Eq. (2) are simply the mirror images of those in Eq. (1). However, they suggest different phenomenological strategies, and this can have practical consequences, as we will see. The focus of this paper will be to give a full account of a bottom-up approach, which to our knowledge has never been made entirely explicit. We will also formally link it back to topdown approaches.
As, we will see, adopting a bottom-up strategy brings with it some significant practical advantages in situations where extracting nonperturbative transverse momentum dependence is the primary goal. It also raises several important aspects of nonperturbative parametrizations that are often hidden to the foreground. All of this will be made clearer in the main body of the text.
One advantage is that, at the input scale Q 0 , it will be natural from the bottom-up perspective to categorize regions as perturbative or nonperturbative in transverse momentum space rather than in transverse coordinate space. This accords well with standard approaches to interpreting experimental data. For example, Ref. [42] offers some interpretations of recent COMPASS measurements with their Fig. 17 and the comment "… the two exponential functions in our parameterisation F 1 can be attributed to two completely different underlying physics mechanisms…". Much earlier, Feynman, Field, and Fox similarly identified different physical TMD mechanisms in transverse momentum space-see Figs. 6 and 7 and the surrounding discussion from Ref. [20]. Associated with observations such as these is an implicit choice to categorize physical mechanisms as perturbative or nonperturbative in momentum space.
The most prevalent models of nonperturbative TMDs were also constructed in momentum space, so they also sort regions into perturbative and nonperturbative categories in transverse momentum space. In many models, the large transverse momentum tails are suppressed on the grounds that it is only the small transverse momentum dependence that is nonperturbative or intrinsic to the hadron structure. Examples include at least spectator models [43][44][45][46], light-cone wave function descriptions [47][48][49][50][51], bag models [52][53][54][55][56], the Nambu-Jones Lasinio model [57][58][59], calculations based on the Dyson-Schwinger equations [60], classic quarkhadron model approaches, and many others [36,61,62]. As will become clear, it is easier to incorporate models such as these into full evolution treatments when starting from a bottom-up approach.
It will also turn out to be easier to match these models to perturbative transverse momentum calculations at large q T in transverse momentum space than in coordinate space. Generally, fixed order perturbation theory calculations in collinear factorization for large transverse momentum (q T ≈ Q) observables are performed in momentum space. When, we categorize behavior as perturbative or nonperturbative in TMD factorization according to momentum space regions, we will find that we are better able to ensure a consistent transition to the large-q T , collinear description where transverse momentum is generated perturbatively. (See also our comments below regarding the "asymptotic" term.) The difficulty of matching regions of q T with different physics has led to proposals to implement all Type II style resummation entirely in momentum space, such as [63,64], inspired by earlier pioneering work that resummed leading transverse momentum logarithms [65].
Another issue that will become clearer from a bottom-up perspective is the question of how to describe the transition between perturbative and nonperturbative transverse coordinate space dependence. Working in coordinate space has significant advantages for both setting up TMD factorization theoretically and implementing it. For example, most versions of the CSS formalism [1,2,66,67] are in coordinate space. In this paper, we will continue to use coordinate space for implementing evolution. But in most top-down approaches to phenomenology, which are almost always formulated in coordinate (b T ) space, it is only when b T is very large, usually above some specified transition point b max , that there is an explicit allowance for a nonperturbative transverse coordinate dependence. The strategy is often to push reliance on perturbation theory, with its well-understood collinear pdfs, normally considered valid only at small b T , as far into the large b T region as possible and thereby maximize the predictive power supplied by collinear factorization. However, unless the functions that parametrize the nonperturbative b T > b max region are constructed very carefully, there can be undesired side effects, particularly if the nonperturbative b T -dependent contribution becomes nonnegligible. In the original setup of TMD factorization, b max marks a totally arbitrary line of demarcation between what are labeled "large/nonperturbative" and "small/perturbative" transverse coordinate space regions. Formally, b max dependence vanishes exactly from any physical observables. (See, for example, our discussion in Sec. 13.10.4 of [3].) A setup such as this ensures that purely perturbative calculations are sequestered from dependence on nonperturbative parameters. However, in practical implementations there is an unsettling tendency for b max dependence to propagate into collinear perturbative calculations [68] and affect final results. Thus, purely perturbative calculations can appear to get contaminated by nonperturbative modeling. A significant b max dependence in a calculation is a symptom that something is likely nonoptimal about the underlying approximations being used. Part of the difficulty is that a valid description over the whole range of b T needs to interpolate between the perturbative and nonperturbative regions, and varying b max amounts to simply reshuffling contributions between factors. In many practical applications, however, there is a rather abrupt switch between a purely collinear perturbation-theory calculation and one that involves a nonperturbative ansatz. We will show how switching from a top-down to a bottom-up perspective clarifies the steps that are needed to minimize b max dependence in the parametrizations used in phenomenological applications. Indeed, we will find that we are able to avoid the usual b Ã prescription entirely, along with a b max , and rely instead only on the input hard scale Q 0 to demarcate any transition between large and small transverse momenta.
Part of the work that is necessary in setting up reasonable phenomenological TMD parametrizations involves ensuring that they retain at least an approximate hadron structure interpretation. For example, the probability density interpretation of an unpolarized quark TMD pdf f i=p ðx; k T Þ gives where i is the flavor of a quark in hadron p and f i=p ðxÞ is the corresponding collinear pdf. We use the "≈" symbol here because the actual transverse momentum integral in Eq. (3) is ultraviolet (UV) divergent in QCD and needs to be regulated. After accounting for the UV divergence (by renormalizing the operator definition of the pdf, for example, and introducing regulators), the equality only holds up to order α s and power-suppressed corrections.
The role that identities such as Eq. (3) play in constraining nonperturbative transverse momentum dependence is often unclear in implementations of the CSS and related formalisms. The main reason is that the relation seems to be satisfied automatically in the way that final CSS cross section formulas are usually presented. To see how this happens, first consider the TMD pdf expressed in terms of its transverse coordinate space version,f i=p ðx; b T Þ, The UV divergence in Eq. (3) manifests itself in Eq. (4) as a divergence inf i=p ðx; b T Þ as b T → 0, so it may be regulated by freezing the b T argument inf i=p ðx; b T Þ below some very small b min . In that case, the integral of Eq.
In the next step, one observes that an operator product expansion (OPE) applies tof i=p ðx; b min Þ in the limit of small b min and relates it to the collinear pdf f i ðxÞ, The C i=j is a perturbative hard coefficient, C i=j ⊗ f j=p is the usual convolution integral of collinear factorization, the sum is over parton flavors, and "p.s." refers to power suppressed corrections. The perturbation theory expansion on the right-hand side is optimized by choosing renormalization scales appropriately in relation to b min . So, dropping small corrections from the right-hand side of Eq. (5)/Eq. (6) [69] ] of the evolved CSS cross sections amplify that impression because they almost always include the transition to the OPE in Eq. (6) at small b T automatically in the final cross section expressions. From Eq. (6) above, it then appears that any nonperturbative modeling of large b T behavior can only impact the right-hand side of Eq. (3) through power suppressed terms that are mostly irrelevant.
However, Eq. (6) is itself a restatement of Eq. (3), not a derivation of it. That is, a derivation of Eq. (6) requires that one identify and factorize integrals over the small, nonperturbative region of parton transverse momentum (i.e., the core contribution to the TMD pdf) and identify them with contributions to a collinear pdf, just as in Eq. (3). Therefore, a derivation of Eq. (3) that refers back to Eq. (6) as its starting point is circular. Using Eq. (6) assumes that a version of Eq. (3) already holds nonperturbatively.
There are practical consequences of this for setting up nonperturbative parametrizations of TMD correlation functions. One discards an important and useful constraint from the hadron structure interpretation of the nonperturbative f i=p ðx; k T Þ if one assumes that Eq. (3) holds automatically on the basis of the OPE alone. By doing so, one is effectively assuming that, from the outset, all important regions of the integrand on the left-hand side of Eq. (3) are at such large k T that they are expressible entirely in terms of collinear pdfs, and that the intrinsic transverse momentum is completely irrelevant insofar as the integral is concerned. But there are many practical situations in TMD physics where that is obviously not the case. For example, at moderate scales where f i=p ðx; k T Þ can be successfully modeled phenomenologically by a Gaussian distribution, with no collinear perturbative tail present at all, the entire range of the integration is described by a nonperturbative model. Thus, it is a requirement that Eq. (3) be imposed explicitly as a constraint from the outset if the goal is to maintain a connection between the nonperturbative TMD parametrizations of hadron structure interpretations and the corresponding collinear functions.
Many past TMD parton model approaches to Type I phenomenology do, of course, impose a version of Eq. (3) on TMD pdfs, but they typically do not include the transition to perturbative transverse momentum at large transverse momentum [or small b T as in Eq. (6)]. Gaussian fits in momentum space are the most prominent examples. The reverse limitation is true of many CSS and similar approaches; these approaches account appropriately for large transverse momentum behavior, but they rarely (if ever) impose conditions analogous to Eq. (3) directly on the models of nonperturbative transverse momentum dependence. We will show in this paper how adopting a bottom-up perspective makes it clearer how to do both simultaneously, and we will give special attention to integral relations such as Eq. (3) and expand on the discussions above.
It is worth noting that similar issues also appear in other formalisms that deal with parton transverse momentum, such as the small-x oriented unintegrated pdfs of the Kimber-Martin-Ryskin-Watt [70,71] treatments [72][73][74].
The issues that are clarified by a bottom-up perspective are interrelated, and the two just discussed will turn out to be examples of that. Earlier, we mentioned how failing to interpolate explicitly between nonperturbative and perturbative transverse momentum dependence can lead to undesired b max dependence in perturbative calculations. But, as we will see, setting up that interpolation involves the use of relations such as Eq. (3). Conversely, imposing a version of Eq. (3) on small transverse momentum descriptions, in a way that accounts for a cutoff around k T ≈ Q 0 , naturally requires a careful treatment of the large k T ≈ Q 0 region, and thus the interpolation to perturbatively large k T becomes relevant.
Very broadly speaking, switching to a bottom-up approach helps with the inverse problem that arises from transforming between transverse momentum and coordinate space (see also [75]). Performing exact Fourier transformations to momentum space requires exact knowledge about the full range of the transverse coordinate space correlation functions. But actual measurements only probe specific limited ranges of b T . Very large Q measurements are only sensitive to small transverse sizes, and so they poorly constrain the nonperturbative transverse coordinate dependence at very large b T . As a consequence, fits evolved from large Q down to moderate Q can be unstable. For instance, the authors of Refs. [76,77] pointed out that the nonperturbative evolution extracted from the earlier CSS fits, which were frequently dominated by very large Q data, is too rapid when it is used to evolve down to the Q of only a few GeVs that are relevant for hadron structure studies. Conversely, measurements done at a more moderate Q ≈ Q 0 might be dominated by nonperturbatively large b T (or small k T ), but they are usually insensitive to the TMD correlation functions' very small b T ≪ 1=Q 0 (or large k T ≫ Q 0 ) behavior. The latter, however, do not generally need to be extracted but are instead calculable in terms of collinear correlation functions. So, a more stable phenomenological strategy is to first emphasize the role of more moderate Q measurements for constraining input nonperturbative transverse momentum and then rely on perturbation theory to evolve to larger Q.
Another aspect of TMD phenomenology that will be easier to control in a bottom-up approach is the matching to a momentum space, large transverse momentum asymptotic term. That step is important for relating the small q T ≪ Q treatment supplied by TMD factorization to the standard momentum space q T ≈ Q treatments in collinear factorization. Agreement between the two types of calculations at large q T provides an important consistency test. In practice, however, ensuring that the asymptotic term and the TMD factorization calculations match can be difficult at moderate Q. That is especially the case near the lowest scales, around Q 0 ≈ 1-2 GeV, that are conventionally accepted as reasonable input scales. The difficulties can be largely traced back to those that we have discussed so far in this Introduction. Errors that grow as Q evolves downward propagate, at least to some degree, to all values of transverse momentum, including the large transverse momentum. The range of accessible q T shrinks, so there is a less well-defined separation between different transverse momentum regions.
We have so far emphasized a dichotomy between bottom-up/top-down approaches to phenomenology to highlight steps that continue to be a challenge when merging nonperturbative TMD parton model descriptions with TMD factorization and evolution in phenomenology, and to provide readers a basic foundation from which to read the rest of this paper. We reassure readers who find the discussion to be rather vague so far that it likely will become much clearer in the main body of this paper; some of the points noted above require explicit examples in order to be made entirely clear. We will provide some such examples within the main text.
Another way to understand the goals of this paper is to recall that nonperturbative transverse momentum dependence enters the standard CSS implementations in the form of extra coordinate space factors, often expressed as exponential factors e −g j=A ðx;b T Þ for a flavor j in hadron A  [69].] Our purpose is to explain, in much more detail than is usual, how to optimally parametrize these functions in phenomenological applications, such that they match to existing models or calculations of intrinsic nonperturbative transverse momentum dependence. Nothing about the standard TMD factorization formalism itself will change. The final outcome will be a recipe for merging arbitrary nonperturbative models of g j=A ðx; b T Þ and g K ðb T Þ with TMD evolution in the CSS style. To our knowledge, this is the first instance of such a discussion. We will connect the bottom-up strategy with more standard presentations that involve g j=A ðx; b T Þ and g K ðb T Þ in Sec. IX. Some of our goals overlap with those recently addressed in [75], which introduced techniques for disentangling large and small b T contributions to cross sections, and these techniques will likely be helpful in future efforts to identify and separate perturbative and nonperturbative contributions.
To further prepare the reader, we end this Introduction by listing the main aspects of what we will advocate that differ from what is sometimes done: (i) When, we discuss phenomenology at an input scale Q 0 , we will work mainly in transverse momentum space rather than in coordinate space. (ii) We will nevertheless account for the perturbative behavior at q T ≫ Q 0 in TMD correlation functions. We will do this even for Q ≈ Q 0 where such contributions are often phenomenologically irrelevant. (iii) For the region where Q ≈ Q 0 , q T ≈ Q 0 , all renormalization scales will be fixed at order Q 0 and not 1=b T since there is no need for resummation of logarithms of the type lnðq T =Q 0 Þ or lnðb T Q 0 Þ. (iv) We will nevertheless perform evolution to Q ≫ Q 0 in coordinate space, and switch to a ∼1=b T scale, but only at a later step. (v) Also for Q ≈ Q 0 , we will explicitly impose approximate matching to the fixed order asymptotic behavior at q T ≈ Q 0 . (vi) In nonperturbative parametrizations of TMD correlation functions at Q ¼ Q 0 , we will explicitly impose a version of Eq. (3). We will do this in momentum space with a large transverse momentum cutoff to regulate UV divergences and to match with what is typically done in phenomenological models. (vii) Our approach will include an explicit interpolation between purely nonperturbative and purely perturbative descriptions of transverse momentum dependence in the TMD correlation functions. Along the way, we will highlight the advantages of these choices by using explicit examples. At the end, we will translate the expressions into forms that are more familiar from standard TMD factorization implementations in the CSS formalism.
We will start by reminding the reader of the basic setup of TMD factorization and evolution in Sec. II. Section III discusses the role of integral constraints for nonperturbative parametrizations of TMD correlation functions in more detail. Sections IV and V will focus on the details of modeling nonperturbative parts of TMD functions at an input scale Q 0 and in momentum space. The steps for using input parametrizations to evolve to higher scales are summarized in a practical phenomenological recipe in Sec. VI. In Sec. VII, we use concrete toy examples to illustrate the steps, including plots. In Sec. VIII, we return the integral relation and discuss it in light of the bottom-up approach. In Sec. IX, we explain how to connect the bottom-up approach to standard CSS expressions. We offer our concluding remarks in Sec. X.
Explaining some intermediate steps will require a more pedantic system of notation than what is normally necessary. To help the reader keep track of symbols and conventions, we have therefore included a notation glossary in Appendix A.

II. TMD FACTORIZATION AND EVOLUTION
We begin by reviewing some of the basic setup of TMD factorization to establish context and introduce notation for later sections. While all of what we discuss is meant to apply to any of the basic processes for which there are TMD factorization theorems, it will be instructive to work within a specific example. For this, we will use semi-inclusive annihilation (SIA) of a lepton-antilepton pair (usually electron-positron) into a pair of nearly backto-back hadrons with a sum over all other final state particles X, A quark-antiquark pair is produced in the hard vertex, and hadrons H A and H B are measured in the final state. This is among the simplest processes to work with theoretically, and it is ideal for illustrating the basics of TMD factorization. (See, for example, the discussion in Chapter 13 of Ref. [3].) The process is illustrated graphically in Fig. 1: An electron (l) and a positron (l) annihilate to create a virtual photon of momenta q, which creates a quark-antiquark pair. The two hadrons measured in the final state with momenta p A and p B are then produced when partons A and B fragment. The momentum of the virtual photon sets the hard scale of the process Q, with q 2 ≡ Q 2 . See also Refs. [78][79][80] for more details about the general kinematical setup.
In a reference frame where the hadrons are back-to-back, the transverse momentum of the photon q T is the relevant observed final state transverse momentum. When it is small relative to the hard scale, q T ≪ Q, it is sensitive to intrinsic transverse momenta of the hadronizing quark and antiquark, respectively. The usual Lorentz invariant kinematic variables related to collinear momentum fractions are where the "≈" means, we drop terms that are power suppressed in the current fragmentation region (by which, we mean z A and z B are fixed and not too small relative to 1).
The "h" subscripts on light cone momentum components indicate that they are with respect to the hadron frame.
In TMD factorization, the unpolarized cross section differential in z A , z B , and q 2 T is written [3] The second line has the familiar form from the TMD parton model, but with extra auxiliary arguments for evolution. The capital D j=A and D¯j =B are the TMD ffs for a quark of flavor j (j) to fragment into hadron A (B). A sum over flavors is implied. In addition to the longitudinal and transverse parton momentum arguments z A;B and k A;BT , the TMD ffs also depend on a renormalization group scale μ and a rapidity evolution scale ζ, which in Eq. (9), we have already fixed equal to μ ¼ μ Q ≡ C 2 Q and ζ ¼ Q 2 to optimize perturbation theory. Here, C 2 is an arbitrary numerical constant of order unity. (Throughout this paper, we will assume C 2 ¼ 1.) Hðμ Q ; C 2 Þ is a hard factor of the form H ¼ 1 þ Oðα s ðμ Q ÞÞ, up to an uninteresting overall constant. The Yðq T ; QÞ term on the last line is an abbreviation for the correction needed for the q T ≈ Q behavior, and it is calculable entirely in fixed order collinear factorization. The second line in Eq. (14) is exactly the TMD parton model familiar from typical Type I applications if we drop the auxiliary μ and ζ arguments and set Hðμ Q ; C 2 Þ ¼ 1.
We will focus on a very specific combination of physical observables in order to simplify later illustrative examples. Say that hadron A is h þ and hadron B is its antiparticle h − . Then we can consider the combination We will also consider only the j ¼ "up quark" contribution to Eq. (9). Then, summing the corresponding terms on the right-hand side of Eq. (9) gives Then we can define nonsinglet TMD fragmentation functions And, we can drop the j index for the rest of this paper and rewrite Eq. (9) in a more abbreviated way as Our results are general and independent of the specific hadrons in the final state, but organizing the discussion around this channel will simplify illustrative example calculations later on by allowing us to drop explicit flavor indices and consider only nonsinglet ffs in parts of calculations that involve collinear DGLAP evolution. Specifically, in our examples we will use the JAM20 collinear fragmentation functions for π þ and its corresponding grid for α s values [81]. Note that because of charge conjugation, to construct the TMDs in Eqs. (12) and  (13), only the nonsinglet combination u −ū is involved. We use the set ¼ 0, from the LHAPDF library [82], since we will not be making any comparison to data. All that is needed for our present purposes is that the collinear ffs employed in our numerical examples obey nonsinglet DGLAP evolution equations. Since our focus for this paper is on the q T ≪ Q region at leading power, where TMD correlation functions are relevant, we will also drop any mention of the Yðq T ; QÞ term from here forward.
The second line in Eq. (14) is exactly the TMD parton model familiar from typical Type I applications if we drop the auxiliary μ and ζ arguments and set Hðμ Q ; C 2 Þ ¼ 1. This term is sometimes called the W term.
It can be convenient to write the W term in terms of coordinate space TMD ffs, where the transverse coordinate and momentum space TMD ffs are related to one another via The TMD ffs D A and D B are hadron-vacuum correlation functions defined in a very particular way in terms of quark field operators. An explanation of these definitions and their origins in factorization theorems would involve topics such as Wilson lines, rapidity divergences, soft factors, and other related issues that are beyond the scope of this article. In recent years, they have been reviewed in many places, so to avoid repetition we refer the reader to [83], which includes an overview and a list of references associated with TMD definitions derived in the style relevant for this article, and also to [84], which includes additional relevant references.
With regard to the TMD definitions, what is important for our purposes is only that they satisfy an exact set of evolution equations for the auxiliary variables μ and ζ. For a TMD ff in coordinate space, the evolution equations are The evolution kernels, γ, γ K , andK, are known by many different names in the literature. In keeping with our earlier work, we will refer toKðb T ; μÞ as the Collins-Soper (CS) kernel.
For large enough μ, the anomalous dimensions γ K and γ are both calculable in perturbation theory, and they are independent of b T .Kðb T ; μÞ is also perturbatively calculable at small b T ∼ 1=μ, but it becomes nonperturbative over large distances, b T → ∞. However, since it is independent of the identity of external hadrons, it has very strong universality properties related to the QCD vacuum.
The evolution equations are first order linear differential equations that relate the TMD ffs in Eq. (15) at an input scale Q 0 to a different scale Q, so they can easily be solved exactly and analytically. The general solutions to the evolution equations for D A and D B , substituted into Eq. (15), allow us to write Wðq T ; QÞ as We could express the same evolved W with TMD ffs in transverse momentum space, and in some cases that might help with intuition since it more closely matches the TMD parton model. But then the Fourier transforms of simple factors become transverse momentum convolution integrals of several functions. For the purposes of implementing evolution, we will continue to work with Eq. (20) in coordinate space, as is normally done. When Q ¼ Q 0 , Eq. (20) exactly reproduces the TMD parton model form of the factorization, So far, Eqs. (20) and (21) still involve no approximations at all on the W term. However, approximations are ultimately necessary, of course, for getting practical results. Rather different kinds of approximations to Eq. (20) can enter in a number of different ways, so the language can become confusing. We will be as precise as possible with our wording here. There are at least four different types of approximation that generally take place simultaneously, including (1) The choices of models, assumptions, or approximations used to describe nonperturbative contributions; (2) The neglect of power suppressed corrections to factorization, such as the last term in Eq. (14); (3) Truncation of high powers of α s in perturbative parts of the calculation; and (4) In phenomenological applications involving fits, whatever assumptions and approximations are used at the level of fit extractions. When, we discuss an "nth-order" perturbative treatment, we will mean by this that all parts that are calculable in fixed order perturbative QCD have been included through order n, and they are optimized using the renormalization group and Collins-Soper evolution [Eq. (17)]. These "perturbative parts" include the right-hand sides of Eqs. (17)- (19), which in Eq. (20) appear as γ K ðα s ðμ 0 ÞÞ, γðα s ðμ 0 Þ; 1Þ, andKðb T ; μ Q 0 Þ when b T (or k T ) is small (large) enough that its b T dependence is perturbative. In Eq. (20) it also includes Hðμ Q ; C 2 Þ and theD A ,D B functions in regions of small b T (or large k T ). An "(n)" superscript on a function means that it has been replaced by its truncated, fixed order perturbation theory calculation through order n. When, we discuss nonperturbative parts in later sections, it should be understood to be in the context of phenomenological extractions. We will assume for the sake of our discussion here that all nonperturbative parametrizations have been made flexible enough that the significant errors come only from the limitations of factorization and truncated perturbation theory, and not from a poor choice of nonperturbative parametrizations or artifacts of the fitting procedure. Thus, a function such asD ðnÞ A should be read as "aD A parametrization including nonperturbative parts extracted from measurements and using an nthorder perturbation theory treatment for its perturbative ingredients." Also, when we use the phrase "nonperturbative part" (e.g., the g functions of the CSS formalism), it should generally be understood that we are not necessarily referring to parts of a calculation that cannot ever be improved with small coupling techniques. It only refers to contributions that we choose to exclude from those factors that we explicitly identify as perturbative.
Now consider how one might use Eq. (20) to do phenomenology in a Type I scenario and from a bottomup perspective. Near the input scale, Q ≈ Q 0 , the evolution factor on the second line is nearly unity, and we can, to a good approximation, just work with Eq. (21). If Q 0 is only of order ∼1-2 GeV, then a phenomenological measurement of Wðq T ; Q 0 Þ is only sensitive to the region of nonperturbatively large b T parts ofD A andD B , whereas the b T ≪ 1=Q 0 contributions are strongly suppressed in cross sections. Therefore, measurements of the Q ≈ Q 0 region place rigid constraints on the nonperturbative transverse momentum dependence while remaining mostly insensitive to perturbative tails from large k A;BT (or small b T ). Indeed, past phenomenology with Eq. (21) has confirmed that the basic TMD parton model is quite successful at describing moderate Q data with entirely nonperturbative k T parametrizations such as Gaussians [18,[85][86][87]. Work that is done in this style effectively replaces the exact Eq. (21) with the approximation where the "np" subscripts on the TMD ffs refer to Fourier-Bessel transforms of entirely nonperturbative model parametrizations for , Gaussians), with no matching to perturbative large-k T tails. If most of the 0 ≲ q T ≲ Q 0 region of transverse momentum dependence is nonperturbative at Q ≈ Q 0 , then there is nothing more to do at this scale, and replacing Hðμ Q 0 ; C 2 Þ by H ðnÞ ðμ Q 0 ; C 2 Þ in Eq. (22) is the maximum amount of perturbative input that is possible.
Notice that, if we have perfectly constrained the input TMD ffs for all k T at the input scale Q 0 , then evolving them to larger Q ≫ Q 0 with Eq. (20) is almost trivial. All that is necessary are γ K ðα s ðμ 0 ÞÞ, γðα s ðμ 0 Þ; 1Þ, andKðb T ; μ Q 0 Þ calculated to nth order in perturbation theory and a way to perform the inverse Fourier-Bessel transform. Thus, traditional Type I extractions of TMD ffs near an input scale might appear at first sight to be all the input that is necessary, aside from a treatment of the evolution kernels, for evolution to any other scale.
However, this will fail to work if applied directly to traditional Type I oriented styles of phenomenological extractions that neglect large k T tails. The reason is that Þ need to be well-described over their entire k T ranges, not just over the 0 ≲ k T ≲ Q 0 regions that moderate Q 0 measurements effectively probe, in order for the evolution to larger Q to be accurate. Fortunately, the large-k T behavior of TMD ffs is calculable in collinear factorization, so nothing prevents us from simply interpolating between phenomenological fits of the small k T region and a large-k T perturbative description of the input TMD correlation functions at k T > Q 0 .
We will discuss how to make this adjustment to standard Type II oriented approaches in later sections.
If Q 0 is, as we intend, a lower bound on what is considered a reasonable hard scale, near the boundary between large and small coupling, then there is a very rapid scale dependence when a choice of μ ∝ Q 0 is varied downward. Therefore, it is preferable to keep renormalization and rapidity scales fixed around μ Q 0 and Q 0 in the moderate k T ≈ Q 0 or b T ≈ 1=Q 0 regions. (We elaborate on this slightly subtle point in Appendix B to avoid breaking the flow of our main discussion here.) But in the k T ≫ Q 0 region, large logarithms of the form lnðk T =Q 0 Þ start to degrade the accuracy of such a treatment. Extending calculations to the very large k T ≫ Q 0 region requires another iteration of the evolution equations to switch scales. (Actually, when we do this in later sections, it will be in coordinate space as is more common, and we will switch to the usual ∼1=b T scale.) We will explain the details of both the interpolation and the scale transformation in Sec. V. The discussion of Eq. (2) in the Introduction was in reference to these types of logarithmic errors.
For readers accustomed to more standard presentations of the CSS formalism, the above discussion might seem like a slightly odd starting point because normally one does not stop with Eq. (20). Rather, one immediately performs the step Þ to scales of order 1=b T to optimize collinear factorization calculations with the OPE in the region of b T ∼ 1=Q. The resulting expressions [see, for example, Eq. (22) of [69] ] highlight the role of the b T ∼ 1=Q calculation. Ultimately, we will perform an equivalent step, but we choose to delay it for now for reasons that we hope will become clear later. The connection between Eq. (20) and the usual CSS presentation will be made in Sec. IX.

III. INTEGRAL RELATIONS
As we will see in later sections, consistently interpolating between nonperturbatively small k T and perturbatively large k T will require integral relations such as Eq. (3). In this section, we will define and discuss a collinear ff that obeys an analogous equation with a cutoff on large transverse momentum. This section also introduces additional notation that will be necessary in later sections. We remind the reader of our notation glossary, Appendix A.
We will use a superscript "ðn; d r Þ" on a TMD ff to indicate that it is calculated in a large (k T ≈ Q) transverse momentum approximation. Thus, As usual, "n" is the order of collinear perturbation theory. Thus, D ðn;d r Þ ðz; zk T ; μ Q ; Q 2 Þ is calculated through order α n s , with powers of α nþ1 s and ∼m=k T errors neglected. However, now we have also included a "d r " in the superscript. This is to indicate that the collinear factorization calculation uses a renormalized collinear ff d r ðz; μ Q Þ. The subscript r on d r in turn labels the UV renormalization or regularization scheme (such as, for example, r ¼ MS renormalization). We also define The m=k T in the error term of Eq. (23) symbolizes contributions that are power suppressed when k T ≈ Q. Throughout this paper, an "m" will always represent any generic mass scale that is of order a small hadronic size such as Λ QCD or an intrinsic transverse momentum. Also, to simplify notation, any power-suppressed contributions of the form ðm=QÞ β or ðm=k T Þ β , with β > 0, will always simply be written as Oðm=QÞ or Oðm=k T Þ, regardless of the power β.
To summarize, the symbol D ðn;d r Þ ðz; zk T ; μ Q ; Q 2 Þ is the approximation to an individual TMD ff wherein it is calculated in fixed order collinear perturbation theory, optimized to the region k T ≈ Q and Q → ∞, and using d r collinear fragmentation functions. The fixed order perturbative expression for D ðn;d r Þ ðz; zk T ; μ Q ; Q 2 Þ in collinear factorization has the form The "⊗" here symbolizes the usual collinear convolution integral In Eq. (25), C ðnÞ D ðzk T Þ is a hard coefficient. We have written its zk T argument explicitly as a reminder that this particular hard factor has k T dependence. Approximations to Dðz; zk T ; μ Q ; Q 2 Þ appropriate to regions other than k T ≈ Q will be left unaddressed for now. They will be discussed in Sec. V.
The TMD ff is related to a collinear ff by an integral over transverse momentum, with the details of the notation to be explained below. In a literal probability density interpretation, μ Q would be set equal to infinity and the second two terms on the right-hand side would be zero. In a renormalizable theory such as QCD, the integral needs to be regulated, and corrections are necessary to relate the cutoff integral to collinear ffs defined in standard schemes. The Δ ðn;d r Þ ðα s ðμ Q ÞÞ term on the righthand side of Eq. (27) is our notation for the perturbative correction through nth order that relates the cutoff integral to the collinear ff d r ðz; μ Q Þ in scheme r. There are also, in general, power-suppressed corrections, as indicated by the error term in Eq. (27). The correction term Δ ðn;d r Þ ðα s ðμ Q ÞÞ is related to collinear ffs via another factorization theorem, and C ðnÞ Δ is an order-α s ðμ Q Þ n hard coefficient, with a Δ subscript included here to distinguish it from the k Tdependent hard coefficient in Eq. (25). The "ðn; d r Þ" superscript in Δ ðn;d r Þ ðα s ðμ Q ÞÞ is to symbolize that Eq. (28) is to be calculated through order α s ðμ Q Þ n , and that the collinear ff is defined in the r renormalization and/ or regularization scheme. Note that Δ ðn;d r Þ ðα s ðμ Q ÞÞ is also a function of z and Q, but we have dropped explicit dependence on those variables to maintain as compact a notation as possible. We also define To make the above more explicit, let us define a new collinear ff that is the transverse momentum integral of a TMD ff regulated with a cutoff on all k T > μ Q : The r ¼ c subscript on the left indicates that this is an ff defined in the "cutoff" scheme. 2 Equation (30) is just the left-hand side of Eq. (27). Dropping the power-suppressed and order-α nþ1 s ðμ Q Þ terms on the right-hand side of Eq. (27) gives an equation that is satisfied only approximately. To give this a notation, we also define If the scheme for dealing with UV transverse momentum divergences is the cutoff scheme itself, r ¼ c, then exactly by definition. If the scheme is modified minimal subtraction, r ¼ MS, then where we have used the standard splitting function notation The quantities in Eqs. (23)-(32) are so far intended to be exact. That is, Dðz; zk T ; μ Q ; Q 2 Þ, d r ðz; μ Q Þ, etc., are the operator definitions.
Equation (30) is a version of Eq. (3) wherein the collinear ff d c ðz; μ Q Þ is defined such that the integral relation is satisfied exactly when the UV divergent integral is regulated with a cutoff. A collinear pdf or ff defined in this way does have some practical advantages. It is a natural definition when regions are categorized as perturbative or nonperturbative in transverse momentum space, and it is the preferred definition in some work associated with hadron structure (see, for example, the work of Brodsky and collaborators [88]). It is also a natural definition for matching to fixed order large-q T calculations, which, of course, are almost always performed in momentum space. However, it also comes with significant disadvantages as well; see, for example, Ref. [89] for a discussion of the different ways of handling UV divergences in definitions of collinear pdfs and the pitfalls of each. An important point is that the correction terms relating cutoff scheme and renormalized pdfs are not just perturbative, but also include the power-suppressed errors in Eq. (27).

IV. THE COLLINS-SOPER KERNEL NEAR THE INPUT SCALE Q 0
Up to now, our discussion has mainly focused on establishing context and defining notation. In the bulleted list of Sec. I, we explained that we will explicitly interpolate between nonperturbatively small k T and perturbatively large k T in our parametrizations of TMD functions at an input scale. With the language and notation of Secs. II and III, we are ready to step through the interpolation details.
Inspection of Eq. (20) shows that there are actually two separate types of functions in which we will need to interpolate between nonperturbative parametrizations and perturbative regions of transverse momentum when we do calculations for phenomenology. First, there are the TMD ffs themselves, also at the input scale Q 0 , which will be necessary for evolving away from Q ¼ Q 0 .
Between these two, it is actually the CS kernel that is the simpler to handle, so we will begin by focusing on it. In fact, the steps for interpolating a full parametrization of an input TMD ff Dðz; zk T ; μ Q 0 ; Q 2 0 Þ between large and small k T will turn out to be very analogous to what we do for the CS kernel.
It is straightforward to calculate Kðk T ; μ Q 0 Þ in fixed order perturbative theory if k T ≈ Q 0 . Using notation analogous to that of Sec. III, we express this as K ðnÞ ðk T ; μ Q 0 Þ is our notation for the fixed order perturbative calculation through order n and with all nonperturbative effects ignored. It takes the form where κ is a function of α s ðμ Q 0 Þ and logarithms involving Away from k T ≈ μ Q 0 , the accuracy of K ðnÞ ðk T ; μ Q 0 Þ as an approximation to Kðk T ; μ Q 0 Þ degrades. The reason for k T < μ Q 0 is simply that the k T dependence of the zero momentum soft radiation is nonperturbative. Conversely, the reason for the approximation failing at k T ≫ Q 0 is that the error terms in Eq. (36) include logarithms of k T =μ Q 0 that diverge in the large k T =μ Q 0 limit.
The k T ≫ Q 0 contribution is numerically unimportant in Eq. (21) or Eq. (22) because the region of transverse momenta far above Q 0 is suppressed at the input scale (for reasonable fixed values of z A and z B ). With regard to Q ≈ Q 0 cross sections, therefore, a suitable parametrization of Kðk T ; μ Q 0 Þ only requires that we deal with the region 0 < k T ≲ Q 0 . That is to say, near the input scale we only need to extend the perturbative k T ≈ Q 0 treatment in K ðnÞ ðk T ; μ Q 0 Þ downward into the nonperturbative k T < μ Q 0 region in order to have a parametrization of Kðk T ; μ Q 0 Þ that is useful for all k T accessible at the input scale. Let us therefore start by defining an "input" parametrization for K ðnÞ ðk T ; μ Q 0 Þ that equals the fixed order calculation, Eq. (38), when k T ≈ μ Q 0 or larger, but interpolates to a nonperturbative parametrization Now the (n) superscript on the left-hand side of this equation refers to the perturbative order of the large k T tail in this input nonperturbative parametrization. When we work with Eq. (20), we will need its coordinate space version of the CS kernel, The scale dependence of the exactK is exactly b T independent by the renormalization group (RG) equation Eq. (18), so we will enforce the condition that an nth-order parametrization satisfies Eq. (18) to order α s ðμÞ n , with only Oðα s ðμÞ nþ1 Þ errors, In Sec. VII A, we will provide an example of a specific trial functional form for Eq. (39). In general, however, any phenomenologically successful parametrization that satisfies Eqs. (39) and (41) is allowed. The parametrizations in Eqs. (39) and (40) are appropriate specifically when Q ≈ Q 0 such that only the region of 0 < k T ≲ Q 0 is important.
However, it is a poor approximation to the truẽ Kðb T ; μ Q 0 Þ in the k T ≫ Q 0 region, and this matters if we evolve to large enough Q for contributions from k T ≫ Q 0 to become significant. In Eq. (40), the large errors manifest themselves as higher order terms logarithmic in b T μ Q 0 , which diverge in the b T → 0 limit. There needs to be a change in renormalization scale. Thus, in coordinate space the more common choice for the RG scale is μ ¼ C 1 =b T , with C 1 being an order unity proportionality constant. The truncated RG improved perturbation theory then increases in accuracy as b T → 0.
To obtain a Kðk T ; μ Q 0 Þ parametrization that works well for all Q, we need steps that combine the stability of fixed scale calculations in the Q ≈ Q 0 , k T ≈ Q 0 region with the RG-improved calculations that optimize for the b T → 0 limit. Specifically, we need to perform a scale transformation on the above parametrization using the RG equation at a b T somewhat below 1=Q 0 . If we implement this scale transformation at small enough b T , it will have a negligible effect on phenomenology that uses the above parametrization near Q ≈ Q 0 where the b T ≪ 1=Q 0 is strongly suppressed. Therefore, fits that use Eq. (39) will be largely unaffected. And, if the transformation takes place in a range of b T at least comparable to ≲1=Q 0 , then its overall effect will only appear at order n þ 1 or higher, so the effect of the transformation will always be one order higher in perturbation theory than the working order. So, the transformation will ensure an accurate treatment of evolution to large Q in any subsequent steps. We will show how this works in detail below.
The first step in implementing the scale transformation is to define a b T -dependent mass scale, which we will call Q 0 ðb T Þ, that smoothly transitions between Q 0 and a 1=b T dependence in the region just below b T ≈ 1=Q 0 . Specifically, where C 1 is an order unity numerical constant, typically taken to be the scales Q 0 and C 1 =b T are numerically similar, so any sensitivity to the difference between the two scales is a higher order effect that can be reduced by including higher orders in perturbation theory. Therefore, the exact form of Q 0 ðb T Þ is arbitrary so long as it provides a reasonably smooth interpolation between the Q 0 and C 1 =b T behavior at large and small b T . Some example suggestions for Q 0 ðb T Þ, which we will call the transformation function, are shown in Appendix C. Next, we need to combine this with the RG equation, Eq. (18), whose exact solution is Here, μ i is an arbitrary initial scale. To make it useful in applications of Eq. (20), let us evolve from an initial scale μ i ¼ μQ 0 (where μQ 0 ¼ C 2Q0 ) so that the right-hand side containsKðb T ; μQ 0 Þ: For any μ ≈ Q 0 , the second term in Eq. (44) is calculable in perturbation theory with the nth-order anomalous dimension, γ K ðα s ðμ 0 ÞÞ → γ ðnÞ K ðα s ðμ 0 ÞÞ, and it vanishes for The original parametrization in Eq. (40) was designed to provide an accurate perturbative description ofKðb T ; μ Q 0 Þ in the region of b T ≈ 1=Q 0 and larger. Now if we replace the first term on the right-hand side of Eq. (44) with K ðnÞ input ðb T ; μQ 0 Þ, it continues to describe the b T ≳ 1=Q 0 region, by our construction ofQ 0 ðb T Þ. However, now the RG-improved perturbative contribution toKðb T ; μQ 0 Þ also remains accurate into the b T ≪ 1=Q 0 region. Therefore, we obtain an optimal parametrization by replacing the exactKðb T ; μQ 0 Þ on the right-hand side of Eq. (44) by the approximateK ðnÞ input ðb T ; μQ 0 Þ and the exact γ K ðα s ðμ 0 ÞÞ by γ ðnÞ K ðα s ðμ 0 ÞÞ. We define this asK The underline onK ðnÞ ðb T ; μÞ is our notation for the final parametrization to be used with evolution. The above applies to the cases where μ ≈ Q 0 , so as a final step we set μ ¼ μ Q 0 and write the underlined parametrization as This is the form of the parametrization for the CS kernel that we will ultimately use in Eq. (20). The errors iñ K ðnÞ ðb T ; μ Q 0 Þ, as an approximation toKðb T ; μ Q 0 Þ, are suppressed by at least α s ðμ Q 0 Þ nþ1 point by point for all b T .
A final constraint on parametrizations of K ðnÞ input ðk T ; μ Q 0 Þ is obtained by recalling that soft gluon effects cancel in collinear factorization when we integrate over all transverse momentum. Thus, after an integration of K ðnÞ input ðk T ; μ Q 0 Þ over k T up to a cutoff k max of order μ Q 0 , sensitivity to any nonperturbative mass parameters must vanish as m=μ Q 0 → 0. We may express this by demanding that where χ ðnÞ ðk max =μ; α s ðμÞÞ is either zero or a perturbatively calculable function, independent of any nonperturbative mass parameters in K ðnÞ input ðk T ; μ Q 0 Þ. Before continuing, let us summarize the basic properties of the parametrization,K ðnÞ ðb T ; μ Q 0 Þ:  (41) and (48) are satisfied for all b T . The resultingK ðnÞ ðb T ; μ Q 0 Þ is an accurate representation of the exactKðb T ; μ Q 0 Þ up to at most (nonlogarithmic) order α s ðμ Q 0 Þ nþ1 errors.
There is an ambiguity in the exact choice of functional form forQ 0 ðb T Þ in Eq. (42) in the region of b T ≈ 1=Q 0 , but this is just the usual scale uncertainty that appears in any truncated perturbation theory, akin to the dependence on the exact numerical choices for C 1 and C 2 . Since Q 0 and C 1 =b T are of similar size when b T ≈ 1=Q 0 , the effect of the transformation is under perturbative control and the ambiguity diminishes as one incorporates higher orders.
To state this more explicitly, consider a family of different choices forQ 0 ðb T Þ smoothly connected by an extra parameter a:Q The only requirement is that Eq. (49) satisfies Eq. (42) for all the a one might consider. Then, d daK ðnÞ ðb T ; μÞ In the second line, we have substituted Eq. (45) and in the last line we have used Eq. (41) while noting that at small b T the suppressed errors are enhanced by terms logarithmic in b T μQ 0 . However, by construction 1 So effects from varying the precise choice of transition functionQ 0 ðb T Þ are always one order higher in α s ðQ 0 Þ than the working order n. We will illustrate the steps above more concretely with specific examples in Sec. VII A.

V. PARAMETRIZATION OF THE TMD FFS
AT Q = Q 0 Now, we turn to the parametrizations of the input TMD ffs themselves. Following the strategy outlined in the Introduction, we will categorize regions as perturbative or nonperturbative for the input scale TMD ff in transverse momentum space. The steps will be very analogous to those just described in the previous section forKðb T ; μÞ. As in that case, we will use an "input" subscript to label the TMD ff parametrization that applies phenomenologically at the input scale Q ¼ Q 0 , and which is to be used in Eq. (22). For k T < Q 0 , the input parametrization will be defined to have a mainly nonperturbative transverse momentum dependence while for k T ≈ Q 0 or larger it will transition into its nth-order perturbative description, the first term in Eq. (23). Specifically, we define The only condition on the intermediate region between k T ≪ Q 0 and k T ≈ Q 0 is that it should be reasonably smooth. The input parametrization in Eq. (52) has the coordinate space representatioñ We will require that the input coordinate space parametrizations satisfy the evolution equations through the nth order in the evolution kernels, at least for the b T ≈ 1=Q 0 region: Usually, these will be satisfied automatically if the parametrization follows Eq. (52). Note the analogy between Eqs. (52)-(55) above and Eqs. (39)-(41) for the CS kernel.
Finally, for the integrated TMD ff to be consistent with the definition in Eq. (30), we must impose it directly on the parametrization, GONZALEZ-HERNANDEZ, ROGERS, and SATO PHYS. REV. D 106, 034002 (2022) 034002-14 Here, we have introduced new notation and another definition. The underline on d ðn;d r Þ c ðz; μ Q 0 Þ is meant to indicate that this is a specific parametrization (one determined by the D (30). In accordance with Eq. (32), it is to have, by its definition, the property that The parametrization d It is worth pausing to review the different types of cutoff collinear ffs that we have introduced so far, given that there are now at least three. First, the version in Eq. (30) with no underlines or superscripts is the exact d c ðz; μ Q 0 Þ that follows from the abstract operator definitions. Second, the d ðn;d r Þ c ðz; μ Q 0 Þ above is a specific parametrization of that definition, with the only requirement being that in the limit of large μ Q 0 it reduces to the nth-order collinear perturbation theory calculation in terms of renormalized ffs with scheme r. Finally, there is the d Because Eq. (56) is just a definition, it contains no constraint by itself. The constraint is in Eq. (57).
So far, the steps for constructing the TMD ff parametrizations are very analogous to those of Sec. IV for Kðk T ; μ Q 0 Þ, but there are some differences. The most significant is that the "perturbative" large k T part of Eq. (52) is not entirely perturbative because it involves nonperturbative collinear ffs as input in Eq. (25). The perturbative contribution to large transverse momentum dependence only enters in the coefficient function C ðnÞ D ðzk T Þ. By contrast, the only input to the perturbative calculation in Eq. (37) is the strong coupling α s .
The conditions in Eqs. (52)- (56) are all that, we need for constructing phenomenologically useful parametrizations in the Q ≈ Q 0 region. Any model or parametrization that satisfies them is acceptable, but we will give some explicit examples in later sections.
However, the perturbative part of the parametrization in Eq. (52) does not provide an accurate description in the region of k T ≫ Q 0 , where ratios of k T and Q 0 diverge. In coordinate space, the same issue arises at b T ≪ 1=Q 0 in the form of large logarithms of μb T . That does not create a problem for phenomenological applications near Q ≈ Q 0 where the k T ≫ Q 0 contributions are suppressed in the integral of Eq. (21). However, it becomes important as one evolves to Q ≫ Q 0 and the k T ≫ Q 0 region starts to contribute more significantly.
Therefore, there needs to be a scale transformation from μ ¼ μ Q 0 to μ ¼ C 1 =b T in the coordinate space TMD ff in the region of b T just below b T ≈ 1=Q 0 . This, of course, is exactly the same issue that we faced in the case of K ðnÞ input ðb T ; μ Q 0 Þ in the previous section. For the TMD ff, it also implies that we have to evolve the CS scale ffiffi ffi ζ p from Q 0 to C 1 =b T . For the scale change, we can reuse the same scale transformation function from Eq. (42).
The exact solution to the TMD evolution equations for an individual TMD ff evolving from scales μ i , Q i to Or, if we use μ i ¼ μQ 0 , Q i ¼Q 0 for the input scale, As of yet, there are still no approximations onDðz; b T ; μ Q 0 ; Q 2 0 Þ. The lef-hand side has no dependence on Q 0 ; anyQ 0 dependence inDðz; b T ; μQ 0 ;Q 2 0 Þ is exactly canceled by an oppositeQ 0 dependence in the exponential factor. Now, we can substitute approximations into the righthand side of Eq. (60) in a way that is again very analogous to the way, we handledKðb T ; μÞ in the previous section by making substitutions on the right-hand side of Eq. (44). We approximateDðz; b T ; μQ 0 ;Q 2 0 Þ on the right-hand side of  (52). Because of the scale transformation, the result is a parametrization ofDðz; b T ; μQ 0 ;Q 2 0 Þ that is an accurate description not only for b T ≈ 1=Q 0 and larger but also for all b T ≪ 1=Q 0 . For theKðb T ; μQ 0 Þ in the exponent on the right-hand side of Eq. (60), we already have the analogous result from Eq. (46) in Sec. IV, and we can reuse it here. All that remains then is to substitute γðα s ðμ 0 Þ; 1Þ and γ K ðα s ðμ 0 ÞÞ by their truncated nth-order perturbation theory approximations. Thus, our final parametrization of the input TMD ff is The underline onD ðn;d r Þ ðz; b T ; μ Q 0 ; Q 2 0 Þ is our notation for the final parametrization of the TMD ff at the input scale Þ has the following properties: (i) Each factor in Eq. (61) is an accurate approximation to the corresponding factor in the exact Eq. (60) point-by-point in b T . (ii) For b T ≈ 1=Q 0 or larger, the exponential evolution factor deviates from unity by a negligible amount. This is by our construction ofQ 0 ðb T Þ. (22) are equally appropriate for applications to phenomenology when Q ≈ Q 0 . Also, recall that theD ðn;d r Þ input ðz; b T ; μ Q 0 ; Q 2 0 Þ was originally formulated in transverse momentum space. (iii) There is a smooth scale transformation to μ ∼ ffiffi ffi ζ p ∼ 1=b T at small b T , but only when b T ≪ 1=Q 0 . Therefore,D ðn;d r Þ ðz; b T ; μ Q 0 ; Q 2 0 Þ continues to provide an accurate approximation to the exact TMD ff after Q ≫ Q 0 where the b T ≪ 1=Q 0 region starts to be relevant. (iv) Thus, at small b T , we may express the TMD ff in terms of collinear ffs d r using the usual OPE methods.
(v)D ðn;d r Þ ðz; b T ; μ Q 0 ; Q 2 0 Þ satisfies the exact evolution equations There are no error terms in either of these equations, and both Eqs. (62) and (63) are valid for all b T . (vi) By contrast, the evolution equations for the "input" subscript TMD ffs in Eqs. (54)-(55) do come with explicit error terms. As was the case forK ðnÞ ðb T ; μ Q 0 Þ, sensitivity to the choice of functional form forQ 0 ðb T Þ is the standard scale uncertainty in truncated perturbation theory, and it vanishes in the limit that Q 0 is large and/or high enough orders in α s ðQ 0 Þ are included. To see this, we may repeat steps analogous to those after Eq. (49): þ Oðα s ðμQ 0 Þ nþ1 Þ þ Oðb T mÞ: Thus, sensitivity to the choice ofQ 0 ðb T Þ, at any order n, vanishes as m=Q 0 → 0.
Since our notation has now grown rather extensive, we remind the reader that it is summarized in Appendix A.

VI. SUMMARY OF STEPS
So far, we have focused on describingDðz; b T ; μ Q 0 ; Q 2 0 Þ only at a fixed input scale Q 0 . Now all that is necessary to calculate Wðq T ; QÞ at any other scale using D ðn;d r Þ ðz; b T ; μ Q 0 ; Q 2 0 Þ andK ðnÞ ðb T ; μ Q 0 Þ is to substitute them into the right-hand side of Eq. (20), along with the nth-order perturbative expressions for Hðα s ðμ Q Þ; C 2 Þ, γðα s ðμ 0 Þ; 1Þ, and γ K ðα s ðμ 0 ÞÞ. The result is an approximation for Wðq T ; QÞ that includes evolution and is accurate for Q ≥ Q 0 , The approximation, notated by the (n) superscript on W ðnÞ ðq T ; QÞ, is such that the scale dependence given by the evolution equations in Eqs. (17) While it might appear that, we have only succeeded at introducing an excessive amount of notation, the end result is a fairly simple recipe for combining any arbitrary model of nonperturbative transverse momentum dependence with full TMD factorization and evolution. After some basic initial decisions such as choosing a value for Q 0 and fixing renormalization schemes, the steps are as follows: (A) Model building A1: Choose a nonperturbative model, or a nonperturbative technique more generally, to phenomenologically parametrize the small transverse momentum dependence in the TMD ff Dðz A ; z A k AT ; μ Q 0 ; Q 2 0 Þ and in Kðk T ; μ Q 0 Þ at the input scale. (See, for example, the list of models in the Introduction. These can likely be used here.) A2: For step A1, make any modifications to the models that are necessary to ensure that they satisfy Eqs. (39), (41), (47), (52), and (54)- (57). This step mostly amounts to extrapolating existing models to low order perturbative descriptions of k T ≈ Q 0 behavior. The result is a set of parametrizations : Choose a functional form for theQ 0 ðb T Þ in Eq. (42) to implement the transition between scales. Use the "input" functions from step A2 to constructK ðnÞ ðb T ; μ Q 0 Þ and D ðn;d r Þ ðz; b T ; μ Q 0 ; Q 2 0 Þ via Eqs. (45) and (61).
(B) Phenomenology at Q ≈ Q 0 B1: Apply factorization phenomenologically to Q ¼ Q 0 , Type I processes by taking in Eq. (65). This corresponds to the TMD parton model formula in Eq. (22) with the input function of step A2. Fix any parameters in the nonperturbative model. This step is essentially no different from traditional TMD parton model motivated approaches to describing Type I processes. Thus, prior existing phenomenological results can likely be reused here. B2: Consider the phenomenological behavior of cross sections in a region of Q around Q ≈ Q 0 . Takẽ in Eq. (65), and use the resulting formula in phenomenological fits to fix any nonperturbative parameters iñ K ðnÞ input ðb T ; μ Q 0 Þ in the Q ≈ Q 0 region. B3: Verify that the effect of replacingK ðnÞ ðb T ; μ Q 0 Þ andD ðn;d r Þ ðz; b T ; μ Q 0 ; Q 2 0 Þ byK ðnÞ input ðb T ; μ Q 0 Þ and D ðn;d r Þ input ðz; b T ; μ Q 0 ; Q 2 0 Þ, respectively, is negligible for numerical calculations around Q ≈ Q 0 .
(C) Phenomenology at large Q C1: Use Eq. (65) to evolve to significantly larger Q and make predictions for Type II observables. Then refit and/or tune the nonperturbative parameters and improve the agreement with the higher Q observables. The adjustment of parameters should be expected to be minimal since the larger Q measurements are less sensitive to large b T .
C2: Continue to repeat step C1 with even higher Q. One should expect the accuracy of predictions to increase, both because of the growing constraints on nonperturbative parameters from previous steps and because larger Q is less sensitive to large b T and more sensitive to small α s ðQÞ perturbative contributions.
It is possible to transform Eq. (65) into a form more familiar from traditional implementations of the CSS formalism. While that is not necessary, and indeed we would advocate for using Eq. (65) directly, at least for hadron structure applications, there may still be situations where it is desirable. The steps for transforming Eq. (65) into the more familiar form will be explained later in Sec. IX.
We have used the example of SIA and TMD ffs to have a concrete factorization formula for illustration purposes, but everything up to now carries over in obvious ways to TMD pdfs and processes such as Drell-Yan scattering. The steps bridge standard approaches to Type I and Type II phenomenology.
Before continuing on to examples, notice that we may rewrite Eq. (65) as by directly substituting Eqs. (46) and (61) into Eq. (65). Which form to use is a matter of convenience, but each highlights different aspects of evolution at moderate and/or large Q. Equation (65) is written in a way that makes it obvious that TMD factorization exactly follows a TMD parton model description at Q ¼ Q 0 . That form of W ðnÞ ðq T ; QÞ also makes it clear that if the TMD ffs and Kðb T ; μ Q 0 Þ are known exactly for all b T at a fixed input scale, the evolution to larger Q becomes technically trivial. Conversely, in Eq. (67), we have been able to eliminate Q 0 entirely from the expression, and the RG improved treatment of the Q ≫ Q 0 limit is automatic in the ∼1=b T scale dependence at small b T . Indeed, this form is very similar to the standard presentation of the W ðnÞ ðq T ; QÞ, as we will see in Sec. IX.

VII. LOW ORDER EXAMPLES
We mean for the steps in Sec. VI to apply to any model or nonperturbative method of parametrizing the intrinsic transverse momentum, so we have left the nonperturbative parts in Eqs. (39) and (52) completely general. But specific examples make the steps much clearer, so we will illustrate them in this section with very minimalistic models of the nonperturbative intrinsic transverse momentum dependence.
We caution that the nonperturbative parametrizations that we will use below should be regarded only as toy examples at this stage, only to be used for illustration purposes. In future work, we hope to implement the steps in Sec. VI with input from at least some of the more sophisticated models or nonperturbative techniques referenced in the Introduction.
Since this section makes much use of the notation we have introduced in earlier sections, we remind the reader once more of our notation glossary Appendix A. We will assume that MS renormalization is used everywhere, and we will always use Q 0 ¼ 2 GeV for making example plots.

A. CS kernel example
To motivate a simple model for the nonperturbative part of the CS kernel, let us recall the physical effect that a nonzeroKðb T ; μ Q 0 Þ has on momentum-space cross sections at small transverse momentum. Consider the CS evolution equation for WðQ; q T Þ in transverse momentum space. Fourier-Bessel transforms become transverse momentum convolutions so (See, for example, Eq. (25) of [69].) The partial derivative here indicates that z A and z B should be fixed. The "Const" is independent of transverse momentum and is related to the anomalous dimension γ. It only contributes to a Q-dependent normalization. Our task is to find a reasonable input Kðk T ; μ Q 0 Þ parametrization that satisfies Eq. (39) and also gives a good phenomenological description of cross sections in the region of Q ≈ Q 0 close to the input scale. One way that Kðk T ; μÞ can contribute to the evolution of WðQ ≈ Q 0 ; q T Þ is simply through a normalization. To capture that behavior in Eq. (68), the model parametrization should include a term proportional to a δ function, This is an example of step A1 applied to Kðk T ; μ Q 0 Þ. But as step A2 prescribes, K input ðk T ; μ Q 0 Þ also needs to match the large k T perturbative description, where for now we work at order n ¼ 1. We also know that [For a textbook derivation of Eq. (71), see Sec. 13.10.2 of [3]. Also see Ref. [90] for earlier calculations. For higher orderK expressions, see also [91][92][93][94], and see Ref. [41] for translating between different notations. See also [95] for more discussion of the operator definition.] As k T decreases below Q 0 , Eq. (70) needs to transition into a nonperturbative eparametrization in a way that is still phenomenologically successful at describing Q ≈ Q 0 behavior. Existing evidence, both theoretical and phenomenological [76,77,96] and from lattice calculations [97], points toward a shape for TMD pdfs and ffs that varies only very weakly with scale in the Q ≈ Q 0 region. Our trial parametrization will reproduce this behavior if it is fairly sharply peaked around k T ≪ Q 0 and then falls off rapidly for larger k T . Equation (68) with K ð1Þ ðk T ; μ Q 0 Þ captures that general behavior if we make the replacement k 2 T → k 2 T þ m 2 K and keep the nonperturbative parameter m K small relative to Q 0 . Thus, we obtain a reasonable candidate for a K ð1Þ input ðk T ; μ Q 0 Þ parametrization that satisfies Eq. (39) if we combine the k 2 T → k 2 T þ m 2 K modification of Eq. (71) with Eq. (69): The transformation into coordinate space is Satisfying both Eqs. (41) and (47) with the MS expression for γ ð1Þ K requires So the input CS kernel is just the single parameter functioñ Note that the same mass m K appears in Eq. (74) and the first term of Eq. (75) reproduces the known lowest order coordinate spaceK ð1Þ ðb T ; μ Q 0 Þ in MS at small b T : At large b T , we get the expected (see [69] Sec. VII-A) constant negative behavior, This completes steps A1 and A2 insofar as they pertain to the CS kernel.
To get aK ð1Þ ðb T ; μ Q 0 Þ that can be extended to calculations ofKðb T ; μ Q 0 Þ at b T ≪ 1=Q 0 , we need to proceed with step A3 and choose a form for the scale transition functionQ 0 ðb T Þ. For now we will use the form in Eq. (C1) from Appendix C for any numerical calculations and plots. Later, we will demonstrate that the details of this choice do not significantly affect calculations.
Finally, we getK ð1Þ ðb T ; μ Q 0 Þ by substituting the trial K ð1Þ input ðb T ; μ Q 0 Þ from Eq. (75) into Eq. (46), for all b T . This completes step A3 from Sec. VI, and it completes the model building for the CS kernel with n ¼ 1. To determine the phenomenological parameter m K , one would need to then proceed to the next steps in Sec. VI, in particular B2. But for that, one needs to first construct a model for the TMD ff itself, which we will do in the next subsection.
Before proceeding to that step, it is instructive to examine the trialK ð1Þ ðb T ; μ Q 0 Þ graphically and to verify that the properties described above hold. For example, we expect only a numerically mild, perturbatively suppressed, dependence inK ð1Þ ðb T ; μ Q 0 Þ on the scale transformation parameter a in the region between small and large b T , consistent with Eq. (79). The lower panel of Fig. 2 shows plots of our trialK ð1Þ ðb T ; μ Q 0 Þ with Q 0 ¼ 2 GeV and m K ¼ 0.1 GeV, while the top panel shows two slightly different functional forms forQ 0 ðb T Þ from Eq. (42), as given in Appendix C and Eq. (C2). The two transition functions are generated by using two different values of a, a ¼ 2 GeV and a ¼ 4 GeV. The narrow central panel shows the percent variation between the twoQ 0 ðb T Þ, and it can be seen from the graph that they differ by a maximum of about ≈30% for b T ≈ 0.3 GeV −1 , but for b T much larger or smaller than this the variation vanishes as per our requirements. TheK ð1Þ ðb T ; μ Q 0 Þ calculation in the lower panel of Fig. 2 was performed using both of the twoQ 0 ðb T Þ functions in the top panel. We have plotted them as solid black and dashed-red curves, but the two are visually indistinguishable, confirming the good approximate scale insensitivity in Eq. (79). Despite the clearly visible difference between the twoQ 0 ðb T Þ functions in the upper panel in the region just below b T ≈ C 1 =Q 0 ≈ 0.5 GeV −1 , the effect of switching between them essentially vanishes in theK ð1Þ ðb T ; μ Q 0 Þ calculations of the lower panel.
The normal way to calculateKðb T ; μ Q 0 Þ in the small-b T limit is to transform to an RG scale, μ Q 0 → C 1 =b T , and use low order perturbative calculations. We should thus expect to recover this in the small-b T limit of theK ð1Þ ðb T ; μ Q 0 Þ from Fig. 2. To check this, let us define a "purely perturbative" calculation of the small-b TK ðb T ; μ Q 0 Þ called K ð1Þ pert ðb T ; μ Q 0 Þ: We have shownK ð1Þ pert ðb T ; μ Q 0 Þ as the violet dashed curve in the lower panel of Fig. 2. As expected, the purely perturbative curve diverges at large b T , while it merges with theK ð1Þ ðb T ; μ Q 0 Þ curves for b T ≪ 1=Λ QCD .
Care is needed when interpreting the numerically small Q 0 ðb T Þ dependence observed in plots such as Fig. 2. In cross section calculations such as Eq. (20),K ð1Þ ðb T ; μ Q 0 Þ multiplies a lnðQ 2 =Q 2 0 Þ, so small errors will be amplified as Q ≫ Q 0 . The effect ofK ð1Þ ðb T ; μ Q 0 Þ on the b T -space cross section is entirely through the overall factor of exp fK ð1Þ ðb T ; μ Q 0 Þ lnðQ 2 =Q 2 0 Þg, as seen in Eq. (20). So, 10-1 100 bT(Gev- 1 ) we get a better sense of the size of errors after evolution by plotting this exponent for a wide range of Q > Q 0 . For example, we may calculateK ð1Þ ðb T ; μ Q 0 Þ using different Q 0 ðb T Þ and compare the resulting exp fK ð1Þ ðb T ; μ Q 0 Þ × lnðQ 2 =Q 2 0 Þg for several Q > Q 0 . We have shown an example of this in Fig. 3 for the two different choices of Q 0 ðb T Þ in the top panel of Fig. 2. In the top panel of Fig. 3, we have replotted the twoK ð1Þ ðb T ; μ Q 0 Þ curves from the lower pane in Fig. 2, but now with axis ranges shifted until the variation between the two calculations starts to become visible on the plot. It is evident from the curves that, we should expect any effect from varying a to be significant only in a band around b T ≈ 0.3 GeV −1 . In the lower panel, we have plotted the ratio of exp fK ð1Þ ðb T ; μ Q 0 Þ lnðQ 2 =Q 2 0 Þg for the two different values of a from the top panel. That is, we plot rða 1 ; a 2 Þ ≡ exp fK ð1Þ ðb T ; μ Q 0 Þ lnðQ 2 =Q 2 0 Þgj a¼a 2 exp fK ð1Þ ðb T ; μ Q 0 Þ lnðQ 2 =Q 2 0 Þgj a¼a 1 for Q ¼ 4 GeV and Q ¼ 100 GeV. After evolving from Q ¼ 2 GeV to Q ¼ 100 GeV, the maximum effect is just under 6%. We will see in more detail how the scale sensitivity propagates to the cross section in Sec. VII C. So far, our example plots have only used m K ¼ 0.1 GeV, but nonperturbative parameters such as these will in general need to be adjusted in fits. In our setup, adjustments to parameters such as m K will have a negligible effect on the b T ≲ 1=Q 0 region ofK ð1Þ ðb T ; μ Q 0 Þ so long as m K is kept small relative to Q 0 . To illustrate this, we have plotted K ð1Þ ðb T ; μ Q 0 Þ once again in Fig. 4 for several values of m K , now on a linear horizontal axis to magnify the effect on the large b T region. The plot confirms that the region of b T ≲ 0.5 GeV −1 is essentially unaffected by the values of the m K parameter between ≈0.1 GeV and ≈0.5 GeV, so long as those values are kept reasonably small relative to Q 0 . A vertical line indicates the b T ¼ 0.3 GeV −1 position where we previously found the greatest scale sensitivity in the perturbative part of the calculation-the peak of the bump in Fig. 3. In contrast to the small scale sensitivity in the bottom panel of Fig. 2, sensitivity to changes in the value of m K is large and clearly visible, but only in the region of large b T . The step of fitting the purely nonperturbative parameter has been sequestered from the treatment of the transition into the perturbative regime. In phenomenological applications, one converges on an unambiguousK as one repeats the steps above but with higher orders for the large k T region. That amounts to constructing parametrizations forK ð2Þ ðb T ; μ Q 0 Þ, K ð3Þ ðb T ; μ Q 0 Þ, etc. Going to larger n reduces sensitivity to arbitrary choices such as the functional form forQ 0 ðb T Þ. Quantities such as a and m K are also increasingly constrained as more data from larger Q are included in fitting.
Extending the above construction ofK ð1Þ ðb T ; μ Q 0 Þ to the case ofK ð2Þ ðb T ; μ Q 0 Þ is straightforward and instructive, but we leave it to future work.

B. TMD ff example
Next, we need to repeat steps A1-A3 from Sec. VI for the TMD ffs themselves. To keep the discussion here simple, we will assume that the TMD ffs are the same for hadrons A and B, and we will continue to focus only on the n ¼ 1 case. Fortunately, the steps are very analogous to the CS kernel, so much of the below will be repetition.
A typical parametrization, common in TMD partonmodel descriptions of Type I processes, is a Gaussian, This fails to satisfy Eq. (52) when, we try to extend it directly to n ¼ 1 because it does not have the right functional form to match to D ðn;d r Þ ðz; zk T ; μ Q 0 ; Q 2 0 Þ when k T ≈ Q 0 . To construct a TMD ff for n ¼ 1, we need to describe the transition from a nonperturbative peak such as Eq. (82) to a perturbative large k T power-law tail. The simplest way to do this is to just append D ðn;d r Þ ðz; zk T ; μ Q 0 ; Q 2 0 Þ to Eq. (82) as an additive term. Inside D ðn;d r Þ ðz; zk T ; μ Q 0 ; Q 2 0 Þ, we can then make the replacement k 2 where m D is a nonperturbative parameter, to smooth the k T → 0 behavior into a nonperturbative peak, analogous to what we did in Eq. (72). Thus, our trial input parametrization is where we have utilized the following abbreviations: For a textbook derivation of the large k T perturbative behavior see, for example, Eqs. (13.101) and (13.66) of Ref. [3]. At the order α s , we are working in, the expressions are independent of the exact renormalization scheme r used for the collinear ffs, so we have left it general in Eq. (83) for now. We will specialize to r ¼ MS later. The TMD ff should at least approximately match its collinear perturbative expansion for k T ≈ Q 0 , so m D and M should be kept small relative to Q 0 in any fits.
One of our requirements from step A2 is that parametrizations of the TMD ffs must satisfy the integral relation in Eq. (56) with Eq. (57) satisfied. We will use this constraint to fix C ðd r Þ , which so far is just another nonperturbative parameter. Evaluating the transverse momentum integral of Eq. (83), expanding in small m=μ Q 0 , and solving for C ðd r Þ in terms of d We still need to choose a form for d ð1;d c Þ c ðz; μ Q 0 Þ, but our requirement is that it must satisfy Eq. (57), where we allow the m=μ Q 0 -suppressed contributions to be chosen to give an optimal parametrization. Thus, let us define the powersuppressed terms in Eq. (57) (again, for n ¼ 1) to exactly equal those in Eq. (86). Then, where in the last line we have used Eq. (31). These last few steps are necessary if we wish to relate C ðd r Þ to known collinear ffs in standard schemes. Equation (83), with Eq. (87) now for C ðd r Þ , is the parametrization of the TMD ff that is to be substituted into Eq. (22) and, in accordance with step B1, used phenomenologically for describing Type I processes with standard TMD parton model techniques near the input scale Q ≈ Q 0 . To get a general sense of what the Eq. (83) parametrization of the TMD ff looks like at Q ¼ Q 0 , we have plotted it in Fig. 5 using MS collinear ffs and reasonable values of the mass parameters m D and M. (The nonperturbative mass parameters are kept small relative to Q 0 ¼ 2.0 GeV.) For this case, Δ ðn;d MS Þ ðα s ðμ Q ÞÞ is given by Eq. (34). The plot is for a sample value of z ¼ 0.3, but other values of z produce qualitatively similar curves, as can easily be checked. Since there is a general expectation from existing Type I TMD phenomenology that the small transverse momentum region in moderate Q processes is well-described by Gaussian TMDs, we have overlaid a Gaussian curve on top of Eq. (83), confirming that the k T ≪ Q 0 region retains a generally Gaussian shape.
As per step B1, the small k T region is to be described by fitting the m D and M parameters to measurements. So long as these mass parameters are reasonably small compared to Q 0 , the parametrization at least approximately recovers the lowest order perturbative description in collinear factorization around k T ≈ Q 0 . In principle M and m D can both have z dependence: but we will not include this in any of our example plots. While, we intend for the above parametrization to be only a toy example to illustrate broader procedural points, it is worth noting that at least some phenomenological support for an additive two-component model [such as Eq. (83)] exists in the observation from [42] that a sum of two peaked nonperturbative functions provides a good fit to data in the moderate Q region. In that case, the two peaked functions were both Gaussians, but the trend is nevertheless suggestive of a two component form more generally. Reference [98] has also confirmed that a combination of a Gaussian peak at small transverse momentum and a power-law tail at large transverse momentum provides a reasonable description of moderate Q data.
The two model parameters in our construction have natural interpretations: The lowercase m D describes exactly how the TMD ff transitions from the Gaussian-like behavior typically ascribed to nonperturbative dependence to the power-law behavior more typical of a perturbative tail, while the capital M parameters control the precise shape of the Gaussian peak at small k T .
The inverse Fourier-Bessel transform into transverse coordinate space is where C ðd c Þ is now defined as in Eq. (87). For checking various properties of the input TMD ff, it will also be useful to have the mb T → 0 limit, Or, specializing to r ¼ MS, Equation (91) is the usual small-b T OPE applied tõ Dðz; b T ; μ Q 0 ; Q 2 0 Þ in the MS scheme through order α s ðμ Q 0 Þ. Using Eq. (90), it is straightforward to verify Eqs. (54)- (55) by directly applying to Eq. (90) a partial derivative with respect to Q 0 and a total derivative with respect to μ Q 0 . This completes our steps A1 and A2 from Sec. VI. Now, we have aD ð1;d c Þ input ðz; b T ; μ Q 0 ; Q 2 0 Þ parametrization that is suitable for Type I phenomenological applications.
Finally, for step A3 we need to construct a paramet-rizationD ðn;d r Þ ðz; b T ; μ Q 0 ; Q 2 0 Þ that applies not only near the input scale Q ≈ Q 0 but also to Q ≫ Q 0 . To switch to the final underlined parametrization, we can continue to use the sameQ 0 ðb T Þ from Sec. IV and Appendix C that we used for the CS kernel. Implementing step A3 amounts to simply substituting ourD ð1;d c Þ input ðz; b T ; μ Q 0 ; Q 2 0 Þ from Eq. (89) into the righthand side of Eq. (61) along with theK ð1Þ input ðb T ; μQ 0 Þ that we already constructed in Sec. VII A, Eq. (75). The final expression is This is simply Eq. (61) again but now we mean it to be implied that it is being used with the specific models from Eqs. (72) and (83). From Eq. (89), andK ð1Þ ðb T ; μQ 0 Þ is the same n ¼ 1 result already written in Eq. (78). C ðd MS Þ is given by Eq. (87). For illustration, Fig. 6 is a plot of our trial D ð1;d MS Þ ðz; b T ; μ Q 0 ; Q 2 0 Þ from Eq. (92), plotted in coordinate space, where as before we have used an input scale Q 0 ¼ 2 GeV and z ¼ 0.3. We have chosen typical sizes for the nonperturbative mass parameters: M ¼ 0.2 GeV, m D ¼ 0.3 GeV, and m K ¼ 0.1 GeV. As in the case of K ð1Þ ðb T ; μ Q 0 Þ, we are able to test scale sensitivity in the intermediate b T region by varying the transition function Q 0 ðb T Þ. In Fig. 6, we do this by again switching between the twoQ 0 ðb T Þ functions from the upper panel of Fig. 2; the solid black and dashed red curves are for a ¼ 2 GeV and a ¼ 4 GeV, respectively. The weakness of the observed variation confirms that the setup is behaving as intended [recall Eq. (64)]. As withK ð1Þ ðb T ; μ Q 0 Þ, sensitivity to parameters such as a can in principle be reduced still further by including higher orders and fitting at larger Q. This requires matching to a higher order treatment of the large q T tail-see, for example, Refs. [93,99,100].

C. Cross section examples
With Eq. (92) now completely set up, all that is needed to get the cross section is to substitute it, along with Eq. (78), into Eq. (65) to obtain a calculation of W ð1Þ ðq T ; QÞ for any Q ≥ Q 0 . To illustrate how the features of theD ð1;d MS Þ ðz; b T ; μ Q 0 ; Q 2 0 Þ andK ð1Þ ðb T ; μ Q 0 Þ parametrizations from the previous subsections influence W ð1Þ ðq T ; QÞ, and to finish reviewing the steps of Sec. VI, we will end this section by examining several example plots of W ð1Þ ðq T ; QÞ. First, Fig. 7 shows q T W ð1Þ ðq T ; QÞ (divided by an uninteresting normalization H ð1Þ ) plotted versus q T and with a selection of Q values just above the input scale Q 0 ¼ 2 GeV. The nonperturbative parameters are the same M ¼ 0.2 GeV, m D ¼ 0.3 GeV, and m K ¼ 0.1 GeV values that we used in our illustrations from the previous subsection, and theQ 0 ðb T Þ is the same transition function from the upper panel of Fig. 2 with a ¼ 2 GeV. The solid black curves are obtained if we substitute the underlined functions of Eqs. (78) and (92) into Eq. (65). For comparison, the blue dot-dashed curves are obtained when we simply substitute the input parametrizations, Eqs. (75) and (91), into Eq. (65), instead of the final underlined parametrizations optimized for the small b T limit. Showing both curves confirms that switching between "input" and underlined parametrizations results in a negligible difference in the cross section calculation at q T ≪ Q when Q is only slightly larger than Q 0 . In Fig. 7, the difference between the solid black and dot-dashed blue curves is nearly invisible for Q between Q 0 and 2Q 0 , and only becomes significant for Q ≈ 5Q 0 and larger q T . Making these observations corresponds to step B3 from Sec. VI. They confirm that the input or the underlined parametrizations are both valid and interchangeable for phenomenological Type I applications near Q ≈ Q 0 .
Next, Fig. 8 shows the W ð1Þ ðq T ; QÞ calculation (from now on, we will always use the underlined parametrizations) plotted against q T on a logarithmic scale for a selection of Q covering a large range between Q 0 ¼ 2 GeV and Q ¼ 1000 GeV. The top panel is calculated with a fixed scale transformation functionQ 0 ðb T Þ, specifically the a ¼ 2 GeV curve in Fig. 2. The bands in the top panel were generated by varying the parameters M, m D , and m K associated with intrinsic transverse momentum by 50% around the values we used in the previous subsections, M ¼ 0.2 GeV, m D ¼ 0.3 GeV, and m K ¼ 0.1 GeV. The lower panel in Fig. 8 shows the same W ð1Þ ðq T ; QÞ calculations, but now with the intrinsic nonperturbative scales fixed to their previous values. Instead, the bands are generated by varying the scale transformation function Q 0 ðb T Þ between the two curves in the upper panel of Fig. 2 corresponding to a ¼ 2 GeV and a ¼ 4 GeV. Comparing t>, · · · · · · · · · · · . , , , , , , , . , . the upper and lower panels in Fig. 8 shows that, within the range of parameters considered here, scale sensitivity is far weaker than the sensitivity to intrinsic transverse momentum parameters. On the logarithmic scale, the scale sensitivity is only visible, even at large Q, around the node where W ð1Þ ðq T ; QÞ crosses zero. Nevertheless, sensitivity to intrinsic transverse momentum parameters does clearly diminish with increasing Q, especially for q T ≳ Q 0 . To get a sense of how rapidly it decreases with our parametrizations, we have plotted the top curves from Fig. 8 again in Fig. 9, but now with linear axes and only for the region of q T < 4.0 GeV in order to magnify sensitivity to variations in M, m D , and m K . We have also normalized the curves by their values at q T ¼ 0. The upper panel in Fig. 9 shows the bands for smaller Q ¼ 2 GeV and Q ¼ 10 GeV scales, and it shows that the small transverse momentum region q T ≪ Q is very sensitive to nonperturbative intrinsic mass scales. At q T ¼ 0, the width of the band is about an order of magnitude for Q ≈ Q 0 . The lower panel of Fig. 9 shows the bands for the larger Q ¼ 100 GeV and Q ¼ 1000 GeV scales. There, the sensitivity to intrinsic mass parameters at q T ¼ 0 is much weaker than for the smaller Q, and it becomes essentially invisible above about q T ≳ 0.02Q. This weak sensitivity to intrinsic nonperturbative transverse momentum parameters at q T ≫ m will be especially important if we need to consistently match to fixed order asymptotic calculations [101].
In phenomenological applications such as fitting, nonperturbative parameters such as M, m D , and m K , along with scale setting choices such as the value of a, become better constrained each time data from somewhat higher Q are included in the fitting. As higher Q are incorporated into fits, and input parameters become better constrained, it eventually becomes unambiguous how to evolve to still higher Q. The steps that, we have described in this subsection, along with the plots used to illustrate them, correspond to steps C1 and C2 in Sec. VI.
The illustrative examples in this subsection are to confirm that the setup in Sec. VI reproduces general expectations. We emphasize once again, before closing the discussion of examples, that our purpose here is not to advocate for a particular choice of a model parametrization for small k T dependence, but rather to illustrate the general steps from Sec. VI in concrete situations. Ultimately, it is up to phenomenological tests to assess the success of any particular model or calculation of nonperturbative transverse momentum dependence.

VIII. INTEGRAL RELATIONS II
It is worthwhile to return again to the integral relations discussed in Sec. III in light of parametrizations such as the one we constructed above. Now recall how integral relations often appear in phenomenological extractions of TMD functions near the input scale. For Q ≈ 1-2 GeV and k T ≪ Q 0 , it is well-known that the shapes of transverse momentum distributions are generally well approximated by Gaussians, so one might reasonably adopt a parametrization of the form where we have dropped subscripts and superscripts to simplify expressions. The TMD and collinear ffs parametrized in this way automatically satisfy the parton model integral relation A parametrization such as Eq. (94) imposes a strong suppression on large k T , cutting off the large k T tail. Because large k T dependence is generally regarded as perturbatively calculable, it is tempting to assume that it is possible to fix Eq. (94) in later steps simply by appending a purely perturbative tail for the region of k T ≳ Q 0 while leaving the initial Gaussian of Eq. (94) completely unchanged. Taking that view literally would mean that, as regards the integral relation, one does not need to modify either Eqs. (94) or (95) other than to explicitly indicate an upper cutoff on the integral on the right-hand side of Eq. (95) and to specify that higher orders in α s and/or 1=Q 0 are neglected: A problem with this approach can be seen, however, in our examples from the previous section, where Eq. (96) would amount to applying the transverse momentum integral in Eqs. (96) to (83) But then the integral on the left-hand side of Eq. (96) becomes The last two terms involving A ðd r Þ and B ðd r Þ would thus need to be identified with the "small corrections" of Eq. (96). However, the factorization theorem applies to the limit that m=μ Q 0 → 0, so the last two terms are not the purely perturbative corrections implied by an expression such as Eq. (96). While A ðd r Þ ðz; μ Q 0 Þ and B ðd r Þ ðz; μ Q 0 Þ involve a coupling α s ðμ Q 0 Þ that vanishes asymptotically such as at large μ Q 0 , the logarithms in Eq. (98) more than compensate for this in the limit of large μ Q 0 . Indeed, the term in Eq. (98) involving B ðd r Þ ðz; μ Q 0 Þ blows up as To keep a consistent integral relation that matches a parton model interpretation while also accounting for tails, evolution, etc., the coefficient of the Gaussian in Eq. (94) needs to be an expression such as Eq. (87) rather than a simple collinear ff.
To write a version of Eq. (96) that is consistent with the presence of a large k T tail region, as with our example in Sec. VII B with Eq. (89), it was necessary to interpolate between the nonperturbative and tail regions first.

IX. COMPARISON WITH THE STANDARD PRESENTATION
After step C2 of Sec. VI, we noted that it is possible to recast final results into a form more familiar from past applications of TMD evolution in Type II contexts. We will show how to perform that translation in this section. We emphasize that the steps below are not necessary for implementing the approach above, so this section may be skipped without missing the main points of this article. Most of the steps below amount to reshuffling factors in the cross section expression. Ultimately, however, the translation is important for comparing approaches.

A. The b Ã method
Readers who are familiar with standard implementations of the CSS formalism might find the surface appearance of our expressions for the evolved W ðnÞ ðq T ; QÞ in Secs. IV-VII somewhat odd. Normally, the cross section is written with nonperturbative TMD effects contained in separate exponential factors usually notated where the nonperturbative transverse momentum dependence is encoded in the (coordinate dependence of) the lowercase g functions in the exponents. (The g A and g B functions are placed inside exponents so that they retain the appearance of Sudakov form factor contributions.) Then, in the usual presentation, the rest of the factors in the cross section automatically get expressed in terms of collinear functions by using the OPE to approximate the small b T behavior. For a specific example of what we mean here, consider Eq. (13.81) of [3].
In this subsection, we will review the steps for transforming the low-q T cross section [or rather Wðq T ; QÞ] in Eq. (20) into the form that involves Eq. (99) g functions.
Setting up the usual presentation of the cross section begins with a partition of the coordinate spaceWðb T ; QÞ into regions considered large and small b T . One does this by defining an arbitrary function of b T , traditionally called b Ã ðb T Þ. The function should smoothly interpolate between b T at small values of b T and a maximum transverse size b max as b T grows to b T ≫ b max . It is otherwise arbitrary. In other words, The value of b max is also arbitrary, but it is usually interpreted roughly as a value somewhere near the boundary between nonperturbatively large and perturbatively small regions of b T . The purpose of the "b Ã method" [102] is to sequester a purely perturbative calculation of transverse coordinate dependence away from a part that involves nonperturbative modeling or fitting. While any reasonably well-behaved, smooth function of b T that obeys the right-hand side of Eq. (100) is a valid b Ã ðb T Þ, the most often used choice is Later on, we will also need to define another hard scale that approaches the RG improved value of μ ¼ C 1 =b T appropriate to the b T → 0 limit but that levels off at a fixed scale at large b T . The simplest (and standard) way to do this is to just use the inverse of b Ã and define The behavior of μ b Ã is similar to that of ourQ 0 ðb T Þ, but Q 0 ðb T Þ approaches Q 0 at large b T while μ b Ã approaches C 1 =b max . [Indeed, in our treatment below we could opt to usē Q 0 ðb T Þ instead of μ b Ã , but we will continue with μ b Ã to make the comparison with standard expressions clear.] Next, one solves Eq. (18) to relate a TMD ff (for hadron A, for example) at an input scale Q 0 to the TMD ff at any other scale ffiffi ffi ζ p bỹ Exactly the same equation applies independently of the transverse coordinate b T , so we also havẽ D A ðz; b Ã ; μ; ζÞ Then the ratio of Eqs. (103) and (104) isD where on the second line, we have defined Since the μ dependence ofKðb T ; μÞ is also b T independent, g K ðb T Þ is μ independent. That is, μ dependence cancels between the two terms, so g K ðb T Þ is μ independent by definition. Also, on the last line of Eq. (105), we have used that the μ dependence ofD A ðz; b T ; μ; Q 2 0 Þ is a b Tindependent overall factor-recall the evolution equation in Eq. (19)-to specialize to the case of μ ¼ μ Q 0 .
Next, one defines the logarithm of the ratio on the last line of Eq. (105) by the symbol −g A ðz; b T Þ: with the A subscript reminding one of potential sensitivity to the identity of the final state hadron. Combining Eqs. (105) and (107) gives TheD A ðz; b Ã ; μ; ζÞ on the right-hand side is still the exact operator definition, but it is only ever evaluated at b T ≤ b max . The remaining exponential factor is sensitive to the large b T region. As of yet, there are no approximations. In particular, any sensitivity to b max or the choice of the b Ã parametrization in Eq. (100) cancels exactly between the factors on the right-hand side of Eq. (108). We have simply taken the original definition ofD A ðz; b Ã ; μ; ζÞ and partitioned it into two factors. The logarithm on the right-hand side of the definition in Eq. (107) is cosmetic; expressing the nonperturbative ratio as the exponential of a function −g A ðz; b T Þ gives it the appearance of a type of contribution to a Sudakov exponent.
Despite the apparent arbitrariness of the above steps, one can aniticipate the motivation for writing the TMD ff as in Eq. (108) by looking ahead. We obtain the full cross section by substituting Eq. (108) into the evolved Wðq T ; QÞ in Eq. (20) with μ ¼ μ Q 0 and ffiffi ffi ζ p ¼ Q 0 . In the resulting cross section expression,D A ðz; b Ã ; μ Q 0 ; Q 2 0 Þ will be wellapproximated by collinear factorization at b T ≈ 1=Q 0 so long as b max ≈ 1=Q 0 and Q 0 is reasonably large compared to nonperturbative scales. This would be sufficient for calculations with Q ≈ Q 0 , where the 1=Q 0 ≲ b T < ∞ region is the only relevant contribution. However, if we plan to evolve to very large Q, then we also need an accurate treatment ofD A ðz; b Ã ; μ Q 0 ; Q 2 0 Þ in the b T ≪ 1=Q 0 limit. But the fixed order calculations ofD A ðz; b Ã ; μ Q 0 ; Q 2 0 Þ in collinear factorization are poorly behaved as b T μ Q 0 → 0, even though this is the limit where perturbative QCD should be most reliable.
As usual, therefore, we need to apply the evolution equations [Eq. (59)] once again in order to evolvẽ D A ðz; b Ã ; μ; ζÞ from μ, ζ to the RG-improved μ b Ã , μ 2 b Ã . The evolution equations allow us to rewrite Eq. (108) as Recall that μ b Ã is defined in Eq. (102). There is, of course, an exactly analogous equation (15) and setting the final scales equal to μ ¼ μ Q and ζ ¼ Q 2 in Eq. (15) gives Equation (110) is very close to the standard way of expressing the CSS-evolved W term. 3 As we have written it, there are still no approximations; the solutions to the evolution equations are exact and the steps above simply reorganize the original factorization formula in Eq. (15). However, by writing Wðq T ; QÞ as in Eq. (110), we have isolated on the first two lines those factors that can be confidently approximated in perturbation theory using collinear factorization. The value of b T never rises above b max , and the scale μ b Ã never drops below C 1 =b max . Therefore, one obtains well-behaved perturbative calculations by replacing Hðμ Q ; C 2 Þ, γðα s ðμ 0 Þ; 1Þ, γ K ðα s ðμ 0 ÞÞ, and Kðb Ã ; μ b Ã Þ by their nth-order perturbative calculations.
For the TMD ffs themselves on the first line, the choice of μ ¼ ffiffi ffi ζ p ¼ μ b Ã implements RG improvement for the limit of small b T . As long as b max is small enough, D A;B ðz A;B ; b Ã ; μ b Ã ; μ 2 b Ã Þ can be expanded in an OPE: which is a more explicit version of Eq. (25) but in b T space.
(Here, as usual, m represents any of the small intrinsic mass scales, including now 1=b max .) Substituting Eq.
along with the other perturbative approximations mentioned above, recovers the standard CSS expression-compare, for example, with the Drell-Yan version of TMD factorization in Eq. (22) of [69].
The b Ã method, as it is explained here, has several desirable properties. There is the elegant feature that, in dealing with the nonperturbative region of large b T , one never modifies or approximates the operator definitions of the TMD ffs themselves. Rather, on the first line of Eq. (110), we have simply changed their arguments from b T to b Ã . Along the same lines, the g functions on the last line have explicit definitions in terms of the underlying QCD operators. The final result for the cross section, Eq. (110), is exactly independent of the choice of the b Ã ðb T Þ function in Eq. (100) or of the value of parameters such as b max . Since changing them simply amounts to reshuffling contributions between the perturbative and nonperturbative factors, the b Ã independence is a version of RG invariance that we can express as Or, if we consider other more general b Ã ðb T Þ functions determined by a collection of possibly many parameters fb-paramsg, we can express the same relation schematically as d dfb-paramsg Wðq T ; QÞ ¼ 0: These relations are exact for Eq. (110). Therefore, it is legitimate to say that perturbative calculations of the first two lines in Eq. (110) are completely perturbative in that they have no dependence on nonperturbative parameters beyond whatever power suppressed errors are introduced when, we substitute Eq. (111). Thus, the b Ã method is not itself a model when used as originally set up. It is only a method for reorganizing contributions between the various factors in Eq. (110). For all of this to be preserved in practice when performing calculations and applying them to phenomenology, the Eq. (99) g functions need to be parametrized nonperturbatively but in a way that preserves Eq. (113). If a change of b Ã or b max induces a change in the perturbative parts, a compensating change is needed in the Eq. (99) functions. (For more on this, see, for example, the discussion in Sec. 13.13.2 of [3].) When specific parametrizations and approximations are substituted for the symbols in Eq. (110), the statements above that are exact become approximate. Ideally, however, one hopes to do this in such a way that the errors are small and controlled.
But implementing this style of approach in practice is complicated by the fact that it leaves very little said about the details of the Eq. (99) g functions other than that they must vanish such as a power at small b T . In particular, it leaves unaddressed the treatment of the transition between the purely perturbative and purely nonperturbative regions in the parametrizations, along with the role that integral relations such as Eq. (96) play in that transition. For very large Q, where the g functions are anyway expected to give only small corrections, these details might not have an important effect on calculations. But they become important once one enters regimes where one is sensitive to the details of hadron structure. In the earliest implementations, b 2 T power laws [103] and/or b T power laws [104] were proposed, and these work reasonably well for many applications. But if a g function from Eq. (99) is modeled with a simple ansatz, relations such as Eq. (113) tend to be violated in ways that can significantly affect results [68,105]. In some applications, the b Ã method and the choice of functions such as Eq. (100) get reinterpreted as being part of the nonperturbative model itself, due to the large b Ã sensitivity that is introduced by most simple ansatzes as discussed above. But then the perturbative calculation of the small b T behavior becomes intertwined with the nonperturbative modeling, which is a situation that the b Ã method of organization in Eq. (110) is meant to avoid.
These and related complications are all very well-known, and efforts have been made to overcome them. One essentially needs to reverse engineer the ansatzes for the g functions to recover some approximate consistency with relations such as Eq. (113). For example, Refs. [68,106] impose continuity for both the functions and their derivatives at a point b max in b T space that they designate as a boundary between perturbative and nonperturbative regions. Similarly, in treating g K ðb T Þ Ref. [69] proposed to expand the nonperturbative parametrization in a power series of b 2 max at small b T and match to the corresponding powers in the perturbative part ofKðb Ã ; μÞ.

B. Bottom-up in the b Ã method
Since the bottom-up procedure that, we described in Secs. IV-VI explicitly interpolates between nonperturbative and perturbative transverse momentum dependence, there was no need for a b Ã method. TheQ 0 ðb T Þ that, we introduced in Eq. (42) plays a role that is in some sense analogous to the μ b Ã from Eq. (102) in that both impose an RG improved μ ∼ 1=b T scale in the small b T limit. However, in the earlier sections of this paper, we never attempted to completely sequester purely perturbative and purely nonperturbative parts.
Nevertheless, the steps for the b Ã method carry over directly to our final expressions from the bottom-up procedure and Eqs. (61)- (65), and the problems summarized at the end of the last subsection are automatically avoided. One simply arrives at expressions for Eq. (99) that are connected to specific nonperturbative TMD models or calculations, constrained from the outset to satisfy all important properties. In this subsection, we will show the details of how to rewrite the bottom-up expression in Eq. (65) using the b Ã method. Most steps are simply a repetition of the normal way of writing the CSS evolution in terms of b Ã , μ b Ã , etc., that we reviewed in the last subsection but now use the underlined functions and their evolution equations.
By construction, all of the underlined objects satisfy the specific evolution equations in Eqs. (48), (62), and (63) exactly. Therefore, steps identical to those leading from Eqs. (103)- (110) apply also to Eq. (65) as long as we maintain all the appropriate underlines and (n) superscripts everywhere. They thus allow us to rewrite W ðnÞ ðq T ; QÞ as but now with The functions defined in Eqs. (115) and (116) are now written in terms of whatever model or nonperturbative calculation is being used at μ ¼ μ Q 0 , but just as with Eqs. (106) and (107), they are μ independent. The underlines and superscripts indicate that these g functions are defined with specific model parametrization and perturbative calculations in mind, rather than the purely abstract g functions of Eqs. (106)- (107). It is instructive to substitute the example parametrizations, Eqs. (78) and (92), that we constructed in Sec. VII into Eqs. (114)- (116) and confirm the b max independence and other expected properties. In Fig. 10, the solid black curves are the sameK ð1Þ ðb T ; μ Q 0 Þ as in Figs. 2 and 3, but for comparison we have also shown the separateK ð1Þ ðb Ã ; μ Q 0 Þ and −g ð1Þ K ðb T Þ curves. The two panels compare the graphs for different values of b max : the top panel is that appears in Eq. (65), and indeed the sum of the dashed and dot-dashed curves in Fig. 10 always reproduces exactly the solid blackK ð1Þ ðb T ; μ Q 0 Þ curve. The separate terms in Eq. (117) are different for different b max but, of course, K ð1Þ ðb T ; μ Q 0 Þ is not.
A similar comparison for −g ð1;d MS Þ A ðz; b T Þ is shown in Fig. 11. Now from Eq. (116) the combination of terms that is independent of b max is The solid black curve in (b max ¼ 0.1 GeV −1 ) and lower (b max ¼ 1.0 GeV −1 ) panels. The dashed and dot-dashed curves are for lnðD ð1;d MS Þ ðz; b Ã ; μ Q 0 ; Q 2 0 ÞÞ and −g ð1;d MS Þ ðz; b T Þ, respectively. The sum of the latter two always reproduces lnðD ð1;d MS Þ ðz; b T ; μ Q 0 ; Q 2 0 ÞÞ while −g ð1;d MS Þ ðz; b T Þ goes to zero such as a power at small b T .
Note carefully that there are no error terms in going from Eq. (65) to Eq. (114). So far, the expression is just another way to write Eq. (65). Plots of Eq. (114) that use our parametrization from Sec. VII produce figures identical to Figs. 7-9.
There are actually two ways that, we can recast the methods of Secs. IV-VI in the b Ã method. In the first, we can simply notice that Eq. (114) is already very close to the standard form. This version of the b Ã expression is literally identical to the bottom-up expression in Eq. (65). However, it requires that we use the exactK ð1Þ ðb T ; μ Q 0 Þ and In our specific examples from earlier, this would be Eqs. (78) and (92). Of course, this defeats the purpose of the b Ã approach.
However, now we can also use that Þ is the first term on the righthand side of Eq. (111). In other words, we can use purely perturbative expressions for these functions since they are constrained to arguments b T < b max . Dropping the order mb max terms and substituting Eqs. (119)-(120) gives us our last expression for W ðnÞ ðq T ; QÞ, The b Ã subscript on the left is to indicate that we have dropped the underlines onK ðnÞ ðb Ã ; μ b Ã Þ and D ðn;d MS Þ ðz; b Ã ; μ b Ã ; μ 2 b Ã Þ and we have neglected the Oðmb max Þ terms in Eqs. (119)- (120). These functions are just calculated in collinear perturbation theory now. Thus, Equation (121)   ðz B ; b T Þ, and g ðnÞ K ðb T Þ are, assuming that one has already followed the steps in Sec. VI. Whatever model, parametrization, or calculational technique we have used to describe the nonperturbative small transverse momentum dependence near the input scale (steps A1 and A2) simply gets substituted into the right-hand side of Eqs. (115)- (116). The integrand in Eq. (121) equals that of Eq. (65) up to corrections that vanish as a power of b T ≤ b max . As long as b max is small relative to the intrinsic mass parameters in the model, Eq. (113) is satisfied automatically. Now in the bottom-up approach here there are no drawbacks to choosing b max small enough to make the right-hand side of Eq. (122) as small as desired. It amounts simply to including more of the perturbative part of the calculation inside the g functions.
Of course, the above applies to any other ff renormalization scheme, and we have used MS because it is the most common.
One may check the above directly by using the explicit n ¼ 1 example from Sec. VII. Equations (114) and (121) are both plotted in Fig. 12 for two values of Q and with a range of b max . The solid black curves are the same W ð1Þ ðq T ; QÞ from Sec. VII with the same sample parameters, while the other curves are generated by Eq. (121) for a sequence of decreasing b max . As expected, Eq. (121) converges to Eq. (114)/Eq. (65), and b max dependence vanishes, for small enough b max . This happens more rapidly at the larger Q where there is less sensitivity to the large b T region. The size of the Oðmb T Þ errors are, of course, dependent upon the size of the parameters M, m K , m D , and a in the parametrization of the g functions. In practice, we may, of course, simply choose b max small enough that this error is negligible.
Compare what, we have done above with the way that g ðn;d MS Þ A ðz A ; b T Þ (and the analogous function for the B hadron) is often treated in top-down styles of approaches, where a g function is usually modeled by an ansatz: Here, the fc 1 ; c 2 ; …g is a list of ansatz parameters. But notice that the integral is constrained by Eqs. (56)- (57). Therefore, the initially postulated fc 1 ; c 2 ; …g parameters are not independent if a separate set of collinear ffs is known and available. At least one of the parameters is fixed in terms of the other parameters and the collinear functions. In the example from Sec. VII, this corresponds to the step where we replaced the C ðd c Þ in Eq. (83) by the right-hand side of Eq. (87). Without that step, the parameters of g ðn;d MS Þ A ðz A ; b T Þ are underconstrained, and the model of the nonperturbative transverse momentum will generally be inconsistent with the collinear correlation functions. 4 To our knowledge, such a step is not explicitly performed in the top-down implementations of TMD factorization and evolution.
These constraints also mean that the nonperturbative g functions are strongly correlated with the collinear functions, consistent with observed phenomenology [107].
Before closing this section, we emphasize once more that the steps above involving b Ã are not strictly necessary if one FIG. 12. A comparison of the cross section W ð1Þ ðq T ; QÞ calculated using Eqs. (114) and (121), with the example models from Sec. VII, at scales of Q ¼ 2 GeV (top panel) and Q ¼ 10 GeV (bottom panel), and with fixed values of intrinsic masses M, m D , and m K as indicated in the labels. In each case, the solid lines implement the cross section in the usual notation of the CSS formalism, i.e., through the use of Eq. (114) and g functions defined in terms of our model examples, as in Eqs. (115) and (116). The dashed lines are obtained by the use of Eq. (121), with the same g as solid lines but with Eq. (121). Note that Eq. (114) is identical to Eq. (65) and is thus independent of b max by construction. 4 Alternatively, one could in principle leave the TMD ff model unconstrained by Eq. (56) and instead calculate all collinear ffs directly in terms of the parametrization for the TMD ff. This may be impractical, however, given the existing extensive phenomenology that constrains collinear functions as compared to the TMD ones. has already adopted the bottom-up steps of Sec. VI and implemented Eq. (65), but they may be helpful in comparing with past results.

X. CONCLUSION
In this paper, we have explained the advantages of a bottom-up/forward-evolution style of treating TMD correlation functions modeled or calculated nonperturbatively at a moderate input scale. Before concluding, let us reemphasize that we have not made any fundamental modifications to standard TMD factorization and evolution-indeed, see Eq. (121). Rather, our purpose has been to make explicit the steps for combining any choice of nonperturbative model or parametrization of transverse momentum dependence with full TMD evolution in a reasonably automated way. This did lead to a significant amount of reorganization of previously known results, but we hope that the final prescription is fairly straightforward. Overall, our discussion in this paper might be summarized as a prescription for constructing the "g functions" of typical CSS implementations in a way that conforms simultaneously with the TMD parton model viewpoint and TMD evolution.
The phenomenological strategy is to take TMDs generated from models, nonperturbative calculations, and/or fits near an input scale Q 0 , and use these to make predictions for higher Q measurements. Based on how successful those predictions are, models and/or fit parameters can be updated. The expected trend is that the amount of parameter adjustment diminishes as larger Q measurements are included.
Future steps involve constructing nonperturbative parametrizations analogous to our example functions in Sec. VII, but with more sophisticated input from nonperturbative QCD, including the models discussed in the Introduction. It is likely (as was our intent) that many existing "Type I" phenomenological results can be combined with our approach with minimal modifications. Using the steps from Sec. VI to implement evolution with specific models such as the spectator or bag models is a natural next task. Of course, all expressions need to be extended to all flavors and channels and to TMD pdfs. Other minor tasks still to be completed include extending to higher n (forK this is simple), including the singlet gluon channel, and extending it to a treatment of gluon TMDs [30].
Over the past several years, there has been significant progress also in lattice-based methods for calculating TMD functions [108][109][110][111][112][113][114][115]. The techniques discussed in this paper will also be important for connecting those developments to experimental data and phenomenological extractions.
We have focused on discussing unpolarized cross sections, but the same steps carry over in a straightforward way to polarized processes. A point of caution is that the integral relations analogous to Eq. (30) become more subtle in some spin dependent TMD functions, and the translation between correlation functions defined with cutoffs and in other schemes is not as straightforward [116,117].
Nonperturbative g functions exactly analogous to Eq. (99) appear in the TMD factorization formula for double scattering with double parton TMDs-see Eq. (6.31) of [118]. Therefore, techniques analogous to what we have described for single scattering should in principle apply there as well.
An application where our procedure also likely helps, but which we have not yet discussed in detail, is in the matching to large transverse momentum. The so-called "asymptotic" term discussed in, for example, Ref. [101] is the large k T asymptote of Eq. (21), and it is an important ingredient for matching to the fixed order collinear calculation at large q T ≈ Q. In top-down/backward evolution approaches, errors that grow larger as Q decreases tend to spoil reasonable agreement between direct calculations of Eq. (21) and the asymptotic term [119]. By contrast, in the bottom-up approach we have laid out in this paper, the matching to the asymptotic term at Q ≈ Q 0 is automatic by construction. A separate but related issue is the difficulty observed in some processes, especially at moderate Q, of explaining q T ≈ Q using standard fixed order collinear factorization and existing collinear pdfs and ffs [120][121][122][123].
In large Q measurements, there is reason to expect that nonperturbative transverse momentum eventually becomes phenomenologically irrelevant. It can remain important, however, when very high precision is a goal, such as measurements of vector boson masses [124][125][126][127][128].
We leave all of these considerations to future work. μ: A generic scale.
See below forQ 0 ðb T Þ, and see also Appendix C. Definition: collinear fragmentation function for a quark with momentum fraction z A fragmenting to hadron A. (8) d A;r ðz A ; μ Q Þ Definition: A more precise notation for d A ðz A ; μ Q Þ. Collinear fragmentation function with UV divergences handled in the r renormalization and/or regularization schemes. (For example, d MS is the standard collinear fragmentation function defined in the MS scheme.) The auxiliary scale μ is the renormalization scale, and in the above we have already replaced it with μ ¼ μ Q ¼ C 2 Q. Throughout this paper, we always assume C 2 ¼ 1. (9) D A ðz A ; z A k AT ; μ Q ; Q 2 Þ Definition: The TMD quark ff for a quark with transverse momentum k AT fragmenting to hadron A. The auxiliary parameters associated with evolution are μ and ζ, and in the above they have already been replaced by μ ¼ μ Q and ζ ¼ μ 2 Q ¼ Q 2 . We use different symbols for μ and ζ to keep derivatives clear. (10) D ðn;d r Þ A ðz A ; z A k AT ; μ Q ; Q 2 Þ A calculation of a TMD quark ff in perturbative nth-order collinear factorization optimized for k T ≈ Q and using collinear ffs d r defined in the r scheme.  input ðz; zk T ; μ Q 0 ; Q 2 0 Þ The input parametrization of the TMD ff with a nonperturbative parametrization for small k T and interpolating to an nth-order perturbative calculation at k T ∼ Q 0 . Applicable to phenomenology at Q ≈ Q 0 where 0 ≤ k T ≲ Q 0 is the relevant kinematical region. The perturbative part uses d r collinear ffs in the r renormalization/regularization scheme. See Eq. (52).

APPENDIX B: SCALE SETTING
The expansion in m=Q that gives the TMD factorization in Eq. (14) as its leading power must eventually break down as Q becomes small. In practice, this means one must choose a minimum Q 0 below which one stops trusting factorization in phenomenological applications, and this is what we have called the input scale Q 0 . Values of μ near Q 0 are understood to be close to the boundary where perturbative expansions in α s ðμÞ and power expansions in m=Q 0 cease to be valid even approximately. For the purposes of the present paper, Q 0 may be treated as a phenomenologically determined number. In practice it is usually taken to be somewhere in the range of 1-3 GeV.
The value of Q 0 fixes the overall RG (μ) and rapidity (ζ) scales in the input TMD ffs in Eq. (21). There is nothing incorrect in principle with working only with TMD factorization in the form that is written in Eqs. (14) and (20). One may regardD A ðz A ; b T ; μ Q 0 ; Q 2 0 Þ,D B ðz B ; b T ; μ Q 0 ; Q 2 0 Þ, and Kðb T ; μ Q 0 Þ as entirely unknown functions to be determined phenomenologically or by other nonperturbative means. This is essentially the TMD parton model and is a standard approach to much phenomenology applied to nucleon structure studies near Q ≈ Q 0 . However, it leaves the TMD ffs very badly constrained at large k T where perturbative calculations should be possible. When k A;BT ≈ Q 0 in Eq. (14), or equivalently when b T ≈ 1=Q 0 in Eq. (20), it is possible to expand the TMD functions perturbatively in collinear factorization and thereby increase predictive power. When k A;BT is much larger than Q 0 , one can further improve perturbative calculations by choosing k A;BT itself to set the hard scale in the individual TMD ffs rather than Q 0 . Of course, when k A;BT is small, it is not appropriate as a hard scale, and one should instead continue to use Q 0 . Indeed, it is the nonperturbative k T dependence that is often of primary interest in TMD hadron structure studies.
In this appendix, we elaborate on the issue of the transition between the choices of Q 0 and k T as scales for the TMD ffs, keeping in mind that the ultimate goal is to smoothly transition between traditional Type I and Type II styles of approach. The discussion will be somewhat schematic and is meant to further motivate the main body of the text. Also, we will work in transverse momentum space where our main points will be somewhat more intuitive, but the discussion applies equally in transverse coordinate space, which is more common.
To see the issues clearly, recall that when k T is large we may express the k T dependence in a single input TMD ff in the form where δ is some function of the coupling and the ratios of the scales that generally appear in logarithms. When k T ≫ Q 0 , the k T =μ Q 0 ratio on the right-hand side of Eq. (B1) diverges so that perturbative calculations of δ with fixed μ Q 0 degrade in accuracy. Optimizing δ in perturbation theory requires another application of evolution from μ Q 0 to μ k T ¼ C k k T and Q 2 0 to μ 2 k T ¼ C 2 k k 2 T where C k is an order unity numerical constant analogous to C 1 and C 2 in the main body of the text. After evolution, the TMD ff that we work with instead has the general behavior Now as k T =Q 0 → ∞ the convergence properties of a perturbative expansion of δ only improves as the coupling and power suppressed terms vanish. Most of the standard Type II/top-down styles of TMD factorization implementations thus only use a scale choice analogous to Eq. (B2) (or rather its coordinate space analog) and rarely the fixed scale version in Eq. (B1). However, small-α s calculations of δ in Eq. (B2) come with numerically unstable truncation errors in the region around k T ≈ Q 0 simply because Q 0 is at the border between perturbative and nonperturbative scales. Small variations in, for example, the value of C k can have a big effect on calculations. The problem is exacerbated by the fact that the transformation from μ Q 0 , Q 2 0 to μ k T , μ 2 k T effectively involves a resummation of many higher order logarithms in the region of k A;BT . That is, what appears in the factorization after the scale transformation is not Eq. (B2) alone but ∼ Dðz; zk T ; μ k T ; μ 2 k T Þ 1 þ X terms like α s ðμ k Þ m ln n μ k T μ Q 0 : However, when k T ≈ μ Q 0 the logarithms on the second line are anyway no larger than any other contributions that are higher order in α s ðk T Þ. 5 The advantage of the scale transformation is lost here. More problematically, because of the rapidly rising coupling α s ðμ k Þ just below k T ≈ Q 0 , the higher order logarithmic terms in Eq. (B3) are numerically unstable over the range of k T extending from just below to just above Q 0 . Thus, calculations that use Eq. (B2) are rather poorly behaved in the k T ≈ Q 0 borderline region where we should expect a reasonably smooth transition to a nonperturbative region. But calculations with a fixed scale can still be at least approximately valid around k T ≈ Q 0 where Eq. (B1) is As long as Q 0 is not very small, one can use low order perturbative calculations to approximate D. And there are no large logarithms in Eq. (B1) if k T ∼ Q 0 , so the advantages of changing scales to Eq. (B2) are absent. Therefore, our preferred choice of scale for the transition region around k T ≈ Q 0 is actually the fixed scale in Eq. (B1), not the k T ≫ Q 0 RG improved scale of Eq. (B2). Another advantage of using the fixed scale is that it automatically gives the fixed order asymptotic term in momentum space at Q ¼ Q 0 and q T ≈ Q 0 , which is also a fixed scale calculation in momentum space, when the TMD ff is substituted into Eq. (14).
The above suggests that it is best to categorize the k T dependence not into just two regions of small k T and large k T , but into three regions: small k T (k T ≪ Q 0 ), large k T (k T ≈ Q 0 ), and very large k T (k T ≫ Q 0 ). The large k T region is where k T is just barely large enough for small coupling descriptions of transverse momentum dependence to be reasonable.
In the discussion above we have worked in transverse momentum space because that better matches the intuition of models and phenomenology. However, the analogous observations apply straightforwardly to transverse coordinate space. In that case, there is a region of very large b T ≫ 1=Q 0 that is entirely nonperturbative, a region of very small b T ≪ 1=Q 0 that is purely perturbative as long as an RG improved μ ∼ 1=b T is used, and an intermediate region of b T ∼ 1=Q 0 where fixed order, fixed scale calculations are ideal.
One of our tasks in the main body of the paper is to interpolate between these three regions in our parametrizations. For the intermediate region of transverse momentum, transitioning between scales only introduces higher order errors.
Notice that our discussion of large logarithms above is a mirror image of how large logarithms are often introduced in explanations of top down approaches. There, one starts with Eq. (B1) but using μ Q and Q 2 instead of the input μ Q 0 and Q 2 0 and assuming Q ≫ Q 0 : Then the task is to determine how to resum large logarithms of lnðk T =μ Q Þ as k T gets small relative to Q, rather than as k T gets large relative to Q 0 .

APPENDIX C: SCALE TRANSFORMATION FUNCTION
For the scale transition function in Eq. (42), we must arrange for the transition from ∼1=b T to Q 0 to occur at b T somewhat smaller than 1=Q 0 to avoid modifying the treatment of Eq. (20) in the Q ≈ Q 0 region. One choice that satisfies this for a Q 0 ¼ 2 GeV is If we wish to adjust the exact shape in the ≈1=Q 0 transition region by adding a parameter as in Eq. (49), we may modify Eq. (C1) by introducing a parameter a, Q 0 ðb T ; aÞ Here the transition between the two RG scales takes place around b T ∼ 1=a. We can use the Eq. (C2) form to check approximate scale independence in the transition region by varying a slightly. C 1 is the usual numerical constant, 5 It is also worth recalling here that QCD perturbation series are only asymptotic rather than convergent.