An $S$-Matrix for Massless Particles

The traditional $S$-matrix does not exist for theories with massless particles, such as quantum electrodynamics. The difficulty in isolating asymptotic states manifests itself as infrared divergences at each order in perturbation theory. Building on insights from the literature on coherent states and factorization, we construct an $S$-matrix that is free of singularities order-by-order in perturbation theory. Factorization guarantees that the asymptotic evolution in gauge theories is universal, i.e. independent of the hard process. Although the hard $S$-matrix element is computed between well-defined few particle Fock states, dressed/coherent states can be seen to form as intermediate states in the calculation of hard $S$-matrix elements. We present a framework for the perturbative calculation of hard $S$-matrix elements combining Lorentz-covariant Feynman rules for the dressed-state scattering with time-ordered perturbation theory for the asymptotic evolution. With hard cutoffs on the asymptotic Hamiltonian, the cancellation of divergences can be seen explicitly. In dimensional regularization, where the hard cutoffs are replaced by a renormalization scale, the contribution from the asymptotic evolution produces scaleless integrals that vanish. A number of illustrative examples are given in QED, QCD, and $\mathcal{N}=4$ super-Yang-Mills theory.


Introduction
The scattering matrix, or S-matrix, is a fundamental object in physics. Intuitively, the S-matrix is meant to transform an "in" state ψ in ⟩ at t = −∞ into an "out" state ⟨ψ out at t = +∞. Unfortunately, constructing an operator in quantum field theory which achieves this projection is far from trivial. To begin, one might imagine that S = lim t→∞ e −iHt . However, this operator does not exist, even in a free theory. For example, acting on states with energies E i , matrix elements of this operator would be infinitely-oscillating phases. The proper resolution in quantum mechanics was first understood by Wheeler [1], who defined the S-matrix to project from a basis of metastable asymptotic states ψ in ⟩ (a nucleus) to other states (other nuclei) ψ out ⟩. This idea was expanded for use in quantum field theory by Heisenberg, Feynman, and Dyson [2][3][4] for calculations in quantum electrodynamics (QED). In modern language, one must factor out the evolution due to the free Hamiltonian H 0 to make S well-defined.
In the Wheeler-Heisenberg-Feynman-Dyson (henceforth "traditional") approach, one assumes that in the far past, the "in" state is well approximated with a freely evolving state, i.e. a state that evolves with the free Hamiltonian H 0 : e −iHt ψ⟩ → e −iH 0 t ψ in ⟩ as t → −∞. The interaction is assumed to occur during some finite time interval so that in the far future, the time evolution is again nearly free: e −iHt ψ⟩ → e −iH 0 t ψ out ⟩ as t → +∞. The state ψ⟩ is then related to the in and out states by Møller operators as ψ⟩ = Ω + ψ out ⟩ = Ω − ψ in ⟩ and so ψ out ⟩ = S ψ in ⟩ where the traditional S-matrix is defined as Unfortunately, this textbook approach has problems too: bare S-matrix elements computed this way are both ultraviolet (UV) and infrared (IR) divergent. 1 Ultraviolet divergences are by now completely understood: they are an artifact of computing S-matrix elements using unphysical fields in terms of unphysical (bare) parameters. When S-matrix elements are computed with physical, renormalized, fields in terms of physical observable parameters, the UV divergences disappear. IR divergences, however, are not as well understood and remain an active area of research. In theories with massless charged particles, such as QCD, S-matrix elements have IR divergences of both soft and collinear origin. Historically, three approaches have been explored to ameliorate the problem: the "cross section method", the "dressed-state method" and the "modification-of-S method". The first way of dealing with IR divergences, referred to as the cross section method (following [5,6]) is the most common. It argues that S-matrix elements themselves are not physical; only cross sections, determined by the squares of S-matrix elements integrated over sufficiently inclusive phase space regions, correspond to observables. Importantly, in this method, IR divergences cancel between virtual contributions and real emission contributions to different final states. The cancellation in QED was demonstrated definitively by Bloch and Nordsieck [7] in 1937. They showed that cross sections in QED (with massive fermions) are IR finite order-by-order in perturbation theory when processes with all possible numbers of final state photons with energies less than some cutoff δ are summed over. The proof of Bloch-Nordsieck cancellation [8][9][10] crucially relies on Abelian exponentiation [8]: the soft singularities at any given order in α in QED are given by the exponential of the 1-loop softsingularities. For theories with massless charged particles, such as QCD, Bloch-Nordsieck fails [11].
In non-Abelian gauge theories, the theorem of Kinoshita, Lee and Nauenberg (KLN) [12,13] is often invoked to establish IR finiteness. The KLN theorem states that for any given process a finite cross section can be obtained by summing over all possible initial and final states for processes whose energy E lies within some compact energy window around a reference energy E 0 , i.e. E −E 0 < δ for a given δ. In fact, the KLN theorem is weaker and its proof more complicated than required. First of all, energy is conserved, so the cancellation must occur without the energy window. Second of all, one does not need to sum over initial and final states; the sum over only final states for a fixed initial state will do, as will the sum over initial states for a fixed final state. This stronger version of the KLN theorem was proven recently by Frye et al. [14]. The proof is one line: for a given initial state, the probability of it becoming anything is 1, which is finite to all orders in perturbation theory. Importantly, both the KLN theorem and its stronger version by Frye et al. generically require the sum of diagrams to include the forward scattering contribution, which is usually excluded from a cross section definition. Unless they happen to be IR finite on their own, the forward scattering diagrams are crucial to achieve IR finiteness. Multiple illustrative examples can be found in [14]. If one wants the cross section to be finite when summing over only a restricted set of final states, insights beyond Block-Nordsieck, KLN, and Frye et al. are required, such as those coming from factorization (e.g. [15][16][17][18][19][20][21][22][23]).
In the second approach to remedy IR divergences, the dressed-state method, the S-matrix is defined in the traditional way, but it is evaluated between states ψ d ⟩ that are not the usual few-particle Fock states p 1 , ⋯, p n ⟩. One of the first proposals in this direction was by Chung [24], who argued that in QED one should replace single-particle electron states p⟩ with dressed states of the form p d ⟩ = e R p⟩ with R defined as where j is a photon polarization vector and a j † k is its corresponding creation operator. The idea behind this dressing is that the eikonal factors p ⋅ p ⋅ k give the real emission amplitude in the singular (soft) limit, which is then canceled by virtual contributions, so that ⟨p d 3 ⋯p d n S p d 1 p d 2 ⟩ is IR finite. The exponentiation of the eikonal interaction is the same mechanism (Abelian exponentiation) as invoked in the Bloch-Nordsieck cancellation. Indeed, the proof of the IR finiteness of these dressed states in QED is essentially the same as in the proof of the Bloch-Nordsieck theorem. This cloud of photons in the dressing has the same form as Glauber's coherent states [25] used in quantum optics (these are, roughly speaking, e R 0⟩), and so the dressed states in this case are commonly called coherent states.
While the coherent state approach is in some ways appealing, it has drawbacks. The main problem is that the IR divergences are just moved from the amplitudes to the states. That is, the coherent states themselves are IR divergent and therefore not normalizable elements of a Fock space (although they may be understood as living in a non-separable von Neumann space, as explained in a series of papers by Kibble [26][27][28][29]). The IR divergence problem is therefore still present in this construction; it has merely been moved from the Smatrix elements to the states of the theory. Additionally, generalizing beyond massive QED to theories like QCD with collinear divergences and color factors has remained elusive [30,31]. In particular, no prescription is given for how to go beyond the singular points (zero energy or exactly collinear). For example, the coherent states are sums over particles with different momenta, so they do not have well defined momenta themselves. Is momentum then conserved by the S-matrix in the coherent-state basis? How does one integrate over coherent states to produce an observable cross section? These problems are not commonly discussed in the literature. As far as we know, no one has explicitly computed an S-matrix element between coherent states. This defect gives the coherent-state literature a rather formal aspect.
The third approach to removing the IR divergences in scattering theory is to redefine the S-matrix rather than the states. That the traditional S-matrix inaccurately captures the asymptotic dynamics arises already in non-relativistic scattering of a charged particle off a Coulomb potential in non-relativistic quantum mechanics. The standard assumption that particles move freely at asymptotic times is not justified for non-square-integrable potentials, like the 1 r Coulomb potential, and leads to ill-defined S-matrix elements. In modern language, the S-matrix element for non-relativistic Coulomb scattering has the form We see that the leading term of order α, corresponding to the first Born approximation, is not problematic: except in the exactly forward limit, there are no divergences in the treelevel scattering process. The logarithmic IR divergence (showing up as a 1 IR pole in d = 4−2 dimensions) first appears in the second Born approximation, where it is seen to be purely imaginary. Moreover, the IR divergent part exponentiates (as do all IR divergences in QED), into the Coulomb phase. Thus, in non-relativistic quantum mechanics, one can apply the cross section ideology even without the inclusive phase space integrals: the cross section for the scattering of a single electron off a Coulomb potential is well defined. However, the S-matrix is not.
One of the first attempts to define an S-matrix for potentials that are not squareintegrable was made by Dollard [32] in 1971. He noted that when incoming momentum eigenstates are evolved to late times with the Coulomb interaction H = H 0 + α r , there is a residual logarithmic time dependence for large t: The intuition for this form is that at large t, the particle moves approximately on a classical trajectory with r = pt m , which gives the logarithmic dependence on t when integrated up to infinity. While the e −i p 2 2m t is removed by Wheeler's e iH 0 t factor, the other term is not and persists to generates the 1 IR divergences in the S-matrix. Dollard then proposed to replace to the e iH 0 t factor with a e iHas(t) factor, with H as (t) defined with exactly the logarithmic time dependence needed to cancel the time dependence in Eq. (5). He then showed the a modified S-matrix, defined with his asymptotic Hamiltonian replacing H 0 , exists for Coulomb scattering.
When the electron is relativistic, the IR divergence in the second Born approximation has real part that does not cancel at the cross section level. So first-quantized quantum mechanics is insufficient to produce an IR-finite cross section: QED is needed. Faddeev and Kulish [33] combined the aforementioned work of Chung in QED and Dollard's in nonrelativistic quantum mechanics. They observed that in QED, infrared divergences have both a real part (as Chung observed) and an imaginary part (the relativistic generalization of the Coulomb phase). These can be combined into a modified S-matrix of the form where corresponds to the Coulomb phase (compare to the ln t dependence in Dollard's form, Eq. (5)). The factor R is similar to Chung's in Eq. (3) but with a power-expanded phase, and annihilation operators included as well: where is the electron-number operator. Acting on states, it pulls out the direction p of each fermion and multiplies the contribution by 1 for electrons or -1 for positrons: ρ(p) q 1 ⋯q n ⟩ = ∑ ± (2π) 3 δ 3 (⃗ p − ⃗ q j ) q 1 ⋯q n ⟩. Faddeev and Kulish proceed to argue that S FK has finite matrix elements between coherent states in QED. They argued that one should include the phase factors in a redefinition of the S-matrix while including the e R factors in dressing the states. Although there are some suspicious orders-of-limit and signs in Faddeev and Kulish's paper (see [6]), we believe their construction is essentially valid. Indeed, one goal of our paper is to translate this classic work in QED to modern language. As we will show in Section 2.3, both the real and imaginary parts in the factor e iΦ(t−) e R are reproduced by the action of a single Wilson line.
In the 50 odd years since Faddeev and Kulish's work, there has been intermittent progress on generalizing the coherent state construction from QED to non-Abelian theories. Early work [30,34,35] focused on trying to use coherent states to salvage the Bloch-Nordsieck theorem, following the QCD counterexamples given by Doria et al. [11,36]. Although soft divergences in QCD do not exponentiate into a compact form as they do in QED [37,38], they still have a universal form and factorize off of the hard scattering [15,23]. Using this observation, it has been argued using a frequency-ordered formalism that dressed states can be constructed between with S-matrix elements in QCD are soft-finite [30,39]. Collinear divergences and the soft-collinear overlap in gauge theories were explored in [6,[40][41][42]. An explicit check of the dressed formalism was performed by Forde and Signer [43] who used explicit cutoffs to separate the regions and showed that the cross section for e + e − → jets can be reproduced at leading power at order α s through finite S-matrix elements. Ref. [42] argued that if soft-collinear factorization holds in QCD, then the dressed state formalism should allow one to construct a finite S-matrix in QCD to all orders. Collinear factorization was proven diagrammatically at large N a decade later [44] and a full proof of collinear factorization and soft/collinear factorization for QCD to all orders in perturbation theory was given in [22,23], inspired by [15-17, 19, 20, 45]. One goal of the current paper is to combine these various insights to provide, for the first time, an explicit construction of an IR-finite S-matrix for QCD.
In all of this literature, there are a number of unresolved issues. First, there are essentially no results about the finite parts of a finite S-matrix. Showing the cancellation of the IR singularities is one thing, but to evaluate S one needs to deal with complications of momentum conservation, cutoffs, UV divergences, and to actually be able to compute the resulting integrals. A prescription to determine the finite parts of the modified S-matrix is required if we are explore the S-matrix's properties. While some authors have suggested criteria such as that the dressed states should be gauge [46] or BRST invariant [47], or have asymptotic charges [48,49], or be compatable with decoherence [50,51], the necessity of these choices is unclear. Certainly nothing goes wrong at the level of cross sections if we proceed using the cross section method. After the finite part is fixed, one must further explain how to relate modified S-matrix elements to observables: what is the measure for integration over momenta in the von Neumann space of dressed states (if one goes that route)? To agree with data, the predictions had better reduce to what one calculates using the IR-divergent S, but how that will happen in any of the approaches to dressed states is rarely discussed. In this paper, we attempt to raise the bar for constructing a finite S-matrix by providing a motivated, calculable scheme, and give explicit expression for S-matrix elements and observables in a number of cases in QED, QCD, and N = 4 super Yang-Mills theory.
The organization of this paper is as follows. We start by motivating and defining a "hard" S-matrix in Section 2. We show how to get finite answers, and connect to the previous work on QED using dressed states in Section 2.1. In Section 2.2, we discuss how to compute observables and show that the same predictions for infrared-safe differential cross sections results from S H as from the traditional S. In Section 2.3 we connect our construction to the expressions of Faddeev and Kulish in QED. We then proceed to explicit calculations, working out the Feynman rules and some toy examples in Section 3. In Section 4 we demonstrate IR finiteness in the process γ ⋆ e − → e − in QED using cutoffs, and illustrate the relative simplicity when pure dimensional regularization is invoked. In Section 5 we discuss Z → e + e − including the connection to the Coulomb phase and the Glauber operator as well as an explicit calculation of the thrust distribution, both exactly at NLO and to the leading logarithmic level using the asymptotic interactions. Section 5.2 makes explicit some of the general observations about exclusive measurements from Section 2.2. Section 6 gives some examples in N = 4 super Yang-Mills theory, connecting to observations about remainder functions, renormalization and subtractions schemes. Concluding remarks and a summary of our main results are given in Section 7.

The hard S-matrix
The intuition behind scattering is that one starts with some initial state, usually well approximated as a superposition of momentum eigenstates, which then evolves with time into a region of spacetime where it interacts, and then a new state emerges. The S-matrix is meant to be a projection of this emergent final state on to a basis of momentum eigenstates. For scattering off a local (square-integrable) potential, this picture works fine. The S-matrix is then defined as S = Ω † + Ω − as in Eq. (2) with the Møller operators Ω ± defined in Eq. (1). However, when the interactions cannot be confined to a finite-volume interaction region, as in Coulomb scattering or in a quantum field theory with massless particles, this picture breaks down: the states at early and late times continue to interact, so the momentum-eigenstate approximation is no longer valid.
As mentioned in the introduction, the simplest example with the traditional definition of S breaks down is for non-relativistic scattering off a Coulomb potential. In this case, the Møller operators acting on momentum eigenstates generate an infrared divergent "Coulomb" phase. While the infrared divergence is a problem for a formal definition of the S-matrix, it is not a problem for cross section calculations that depend only on squares of S-matrix elements. In relativistic Coulomb scattering, or in QED, S has both an infrared divergent Coulomb phase and an infrared divergent real part. A convenient feature (Abelian exponentiation [8]) of QED is that a closed form expression is known for the IR-divergent contribution to all orders in perturbation theory for any process. Indeed, the 1-loop divergences are given by S ∼ γcusp IR where the cusp-anomalous dimension is (see [52]) with the cusp angle defined by  [53]. Thus, it is possible to factor out IR-divergent parts from the S-matrix and redefine a new S-matrix that is IR-finite order-by-order. This was done by Chung and Faddeev and Kulish, as discussed in the introduction. Note that the non-relativistic limit corresponds to β → 0 in which case γ cusp = iα 1 β becomes the purely imaginary Coulomb phase.
When the charged particles are also massless, as in QED with m e = 0, new IR divergences appear associated with collinear divergences. Soft-collinear divergences appear as double IRpoles. Indeed, in the m e → 0 limit, v µ i becomes lightlike, so β → ∞. At large β in the cusp angle γ cusp ∼ − α π β diverges linearly with β, so the S-matrix now has double, 1 2 IR poles. In QCD, or other non-Abelian theories, the cusp angle gets corrections beyond one loop and the IR divergences do not exponentiate into a closed form expression [37,38,54]. These complications have made it difficult to come up with a complete formulation of an IR-finite S-matrix in general quantum field theories [6,42,43].
The approach we take in this paper is to construct an S-matrix that is IR finite by replacing the free Hamiltonian H 0 in the definition of the traditional S-matrix with an appropriate asymptotic Hamiltonian H as . That is, we can define new hard Møller operators and a hard S-matrix as Ideally, we would want to choose H as so that the hard Møller operators exist, as unitary operators on the Hilbert space. Proving their existence is challenging, as even in a massgapped theory, where we can take H as = H 0 , they do not exist by Haag's theorem [55]. From a practical point of view, we can be less ambitious and aim to choose H as so that the hard S-matrix is free of IR divergences at each order in perturbation theory. If this was our only criteria, we could choose H as = H, so that S H = 1.
A better criteria for defining H as is that, in addition to capturing long-distance interactions, the asymptotic Hamiltonian should be defined so that the asymptotic evolution of the states is independent of how they scatter. It is possible to define H as this way due to universality of infrared divergences in gauge theories. Using factorization [15][16][17][18][19][20][21][22][23], the soft and collinear interactions can be separated from the hard scattering process: Any S-matrix element in gauge theories can be reproduced by the product of a hard factor, collinear factors for each relevant direction, and a single soft factor. See [23] for a concise statement of factorization at the amplitude level.
In order to exploit factorization, we employ methods developed in Soft-Collinear Effective Theory (SCET). The theory provides a systematic power expansion of the QED or QCD Lagrangian, and reproduces all infrared effects. The leading power Lagrangian in SCET is [56,57] (13) where s and c, n are soft and collinear labels respectively and the collinear covariant derivative is The last term L Glauber describes Coulomb or Glauber gluon interactions [58] (see also [59]). Pedagogical introductions to SCET can be found in [56,57,60]. We define the asymptotic Hamiltonian H as to be the SCET Hamiltonian appended with free Hamiltonians for massive particles. The hard S-matrix is then defined in terms of H as using Eqs. (11) and (12).
Although the SCET Lagrangian looks complicated and non-local, much of the complication comes from being careful to include only leading-power interactions. In principle, for a theory to be valid at leading power, one could include any subleading power interactions one wants. Exploiting this flexibility, the collinear interactions in L SCET can be replaced simply with the full interactions of QCD: iψ c n D c ψ c n . The soft interactions, from theψ c n n 2 n⋅A a s (x − )ψ c n term, are also not that complicated: they are equivalent to treating the collinear fermions as being infinitely energetic, with no recoil. That is, the fermions act as classical sources for radiation moving in a straight line along the n µ direction. This leads to an alternative representation the soft interactions as coming from Wilson lines. This connection is made more precise in Section 2.3.
In practice, when computing S H elements we will not use the explicit and cumbersome interactions in L SCET . Instead, we will take the method-of-regions approach [57,61]. We start with a particular Feynman diagram and then expand to leading power based on the collinear or soft scaling associated with particles involved. In a sense, this is the most straightforward and foolproof way to compute S H amplitudes. Numerous examples are given in subsequent sections.
We also, in accord with the general principles of the method of regions, do not impose any hard cutoffs on the momenta of the soft and collinear particles that interact through H as . Imposing cutoffs is helpful for demonstrating explicit IR-divergence cancellation, and some examples are provided in Section 4.1. However, cutoffs generally lead to very difficult integrals, and moreover they break symmetries like gauge-invariance that we would like S H to respect. More precisely, it is only the finite, cutoff-dependent remainder terms that may depend on gauge -the IR divergence cancellation mechanism is gauge-independent. Since the cutoff-dependent finite parts are unphysical anyway, it is not a problem that they are also gauge-dependent. In general, however, the whole framework with cutoffs is rather unwieldy.
When using pure dimensional regularization, the diagrams involving H as interactions will lead to scaleless integrals. These integrals are both UV and IR divergent. The IR divergences cancel in other contributions to S H (as we will provide ample demonstration), but the UV divergences must be removed through renormalization. As a consequence, in pure dimensional regularization, S H -matrix elements are not guaranteed to be independent of renormalization scheme. Indeed, they are generally complex and will depend on the scale µ at which renormalization is performed. The S H -matrix is not scale independent: d dµ S H ≠ 0, in contrast to S which does satisfy the Callan-Symanzik equation d dµ S = 0. This is unsatisfying, but not unsettling, as S H elements are not themselves observable. (To be fair, if S-matrix elements are IR divergent, it is not clear what it means to say they are scale-independent).
In any case, one should think of S H (µ) like one thinks about the strong coupling constant α s (µ) in MS. While α s (µ) is not observable, it is still an extraordinarily useful concept. The running coupling indeed encodes qualitatively and quantitatively a lot of important physics, such as unification and confinement. As with α s (µ), when S H (µ) is used to compute an observable, the scale dependence will cancel. We demonstrate that in general in Section 2.2, and provide an explicit example in Section 5.

S H and dressed states
The usual way of calculating S-matrix elements in perturbation theory is to work in the interaction picture, where one expands the interactions in terms of freely evolving fields. The propagators for free fields have a relatively simple form, and S-matrix elements then become integrals over these propagators. One might try to work out Feynman rules for S H analogously, in an asymptotic interaction picture. Then propagators would correspond to non-perturbative Green's functions for the soft and collinear fields in L SCET , including all of their interactions. Unfortunately, finding a closed-form expression for these propagators is not possible. In any case, it is not necessary, since if we want to work perturbatively in the coupling constants, we must do so consistently in both H and H as .
To proceed, we note that the hard S-matrix can be written suggestively as where are asymptotic Møller operators and Ω ± = lim t→±∞ e iHt e −iH 0 t are the usual Møller operators. Inserting complete sets of states lets us write hard S-matrix elements between a Heisenberg picture out-state ψ out ⟩ and a Heisenberg picture in-state ψ in ⟩ as Here the integral is over complete sets of Fock-space states ψ ′ in ⟩ and ψ ′ out ⟩. The hard scattering matrix elements are written as a product of three terms. The middle term is the traditional S-matrix and the outer terms correspond to evolution with the asymptotic Møller operators. The Feynman rules for these contributions closely resemble those of time-ordered perturbation theory and are derived in Section 3.1 below.
Another interpretation of the hard matrix elements can be obtained by defining dressed states as Then, i.e. the matrix elements of the hard S-matrix are equivalent to matrix elements of the traditional S-matrix between dressed states. This connection was made in the context of QED in [6]. The role of the asymptotic evolution can then be viewed as transforming the in-state defined at t = 0 into a dressed state at t = −∞ that scatters in the traditional way (with S). The role of dressed states is illustrated in Figure 1.
The dressed states ψ d in ⟩ and ψ d out ⟩ are not normalizable elements of the Fock space that ψ in ⟩ and ψ out ⟩ live in. Indeed, if we expand them perturbatively their coefficients in the Fock space basis contain infrared divergent integrals. For example, starting with an e + e − ⟩ state in QED, the asymptotic Møller operator can add or remove soft photons with each factor of the coupling e. Up to order O(e 2 ) the dressed state will be a superposition of the leading order e + e − ⟩ state, e + e − γ⟩ states and e + e − γγ⟩ Fock states. Explicitly, Let us make a few observations about these dressed states. First, note that the Fock states being added have different 3-momenta. When k has exactly zero momentum (the case almost exclusively considered in the literature), momentum is conserved. But if one really wants to take these dressed states seriously, k must be allowed to have finite energy too, and then ψ d in ⟩ is not a momentum eigenstate. Second, the coefficient at order e 2 is a UV and IR divergent integral. The IR divergence is expected; it is exactly the IR divergence that cancels the IR divergence in elements of S to make elements of S H IR finite. Nevertheless, it makes ψ d in ⟩ hard to deal with as a state. The divergence requires an excursion from the traditional Fock space to a von Neumann space [26][27][28][29]. The UV divergence is due to the fact a soft momentum is not sensitive to any hard scale in the problem, so there is no natural cutoff on the k integrals. One could, of course, put in explicit hard cutoffs on the soft momenta, however, it is easier to simply renormalize the UV divergence by rescaling ψ d in ⟩. Third, it is not each separate electron that is being dressed. Rather it is the combination. Indeed, the IR divergence in the example above comes from loops connecting the two electrons. These loops are critical to cancelling the IR divergences in S H . In Chung's original formulation (cf. Eq. (3)), a picture can be sketched for a coherent state as an electron moving with a cloud of photons around it. But this picture is too naive: the cloud depends on all the charged particles. This is even clearer in QCD, where the soft factors come with non-Abelian color matrices so one cannot rely on the crutch of Abelian exponentiation to move the dressing factors from state to state at will. A discussion of additional complications in QCD and the failure of Bloch-Nordsieck mechanism, can be found in [30].
In conclusion, although the dressed state picture fits in naturally with the construction of S H we have presented, we doubt that thinking of the dressed states as physical states will ultimately be profitable.
We emphasize that for the purpose of having finite matrix elements, neither the in-and out-states ψ in ⟩ and ψ out ⟩, nor the dressed states ψ d in ⟩ and ψ d out ⟩, need to be eigenstates of the asymptotic Hamiltonian.
In the examples to follow we will take ψ in ⟩ and ψ out ⟩ to be eigenstates of the free momentum operator P µ 0 with a finite number of particles, but in principle they can be taken to be any sensible linear combination of states in the relevant Hilbert space, i.e. with finite coefficients, in contrast to the usual coherent states which are an infinite linear superposition of Fock state elements. The S H -matrix elements between any such states are always finite.

Computing observables using S H
To compute an observable using S H , one must specify what is to be included in the measurement and what is not. As a concrete example, consider computing the inclusive decay rate of the Z boson in perturbation theory. Since the Z does not couple to massless gauge bosons, it has no interactions in H as and therefore Ω as ± Z⟩ = Z⟩. The rate is then (up to kinematic factors) The sum is over all states in the theory except the Z itself, since Z → Z does not contribute to the rate and includes an implicit integral over the phase space for X⟩. Now we write where Ω as † + Z⟩ = Z⟩ was used in the last step. So the sum over final states gives the same decay rate using S H as it would using S. The key here was that there are no asymptotic interactions for Z. If there were, then the derivation would not hold. But in that case, the Z → Z forward scattering amplitude would be infrared divergent using S so it is not clear what physical result we should expect.
Suppose we wanted to compute something less inclusive than the total decay rate. The observable has to be infrared safe. For example, we could consider a 2-jet rate in e + e − → hadrons. Such a rate depends on the jet definition, which depends on exactly how the soft and collinear momenta are handled. In other words, it depends not only on the hard process, which is roughly speaking the jet-production amplitude, but also on the evolution of the jets after the hard scattering occurs. For this evolution, we need to include the dynamics induced by e −iHast+ ≡ lim t→∞ e −iHast , as the state evolves from t = 0 to t = ∞ after the hard scattering. That is, we should define our exclusive cross section as Here N jets (X) is the measurement function which takes as input the momenta of the particles in the final state X and returns the number of jets according to some jet definition. The factor ⟨Y S H Z⟩ gives the amplitude to produce the jets and ⟨X e −iHast+ Y ⟩ gives the amplitude for those jets to evolve into a state with the particles in X⟩ at asymptotic times. The sum over Y can be as restrictive as desired. For example, if Y is taken to be only qq⟩ quark-antiquark states, the distribution will be valid to leading power. To get the jet mass distribution exactly right, including subleading power effects, one should extend the sum from over qq⟩ states to anything that could possibly evolve into a state X with N jets (X) = 2. For example, qqg⟩ should be included. If all states are allowed then one can replace ∑ Y Y ⟩⟨Y with 1. In that case, the rate reduces to The e iH 0 t+ factor generates a phase e iE X t which is constant for all X by energy conservation and therefore drops out of the absolute value. Thereby the exclusive cross section reduces to the same thing one would compute using S (in agreement with a century of theory/experiment comparisons). A cartoon of the reduction of the cross section to the one computed with S for this process is shown in Fig. 2. Just because one can reduce cross section calculations using S H to those using S, does not mean one should. Additional physical insight is gained by maintaining the separation into a calculation of S H first and then of the evolution using e −iHast+ or equivalently Ω + as . In particular, since H as is independent of the hard scattering, the separation leads to the physical picture of a short-distance amplitude for jet production followed by an evolution from short-to-long distances where the jets are resolved into their constituents. For example, in the computation of thrust in e + e − events, when the events comprise pencil-like jets, the structure of the distribution is almost completely determined by the asymptotic evolution alone. This example, and the utility of the separation will be discussed more in Section 5.  Figure 2: An observable is computed by integrating the square of an amplitude against a measurement function, inserted at t = ∞. In computing an exclusive observable sensitive to the asymptotic dynamics, one must evolve the dressed states to +∞ using the asymptotic Hamiltonian. The example Z → jets is illustrated on the left. The result is equivalent to evolving the initial state Z⟩ at t = −∞ with the full Hamiltonian to the set of states X⟩ on which the measurement is performed at t = +∞ (right).
The above discussion of observables also helps clarify how one should think of assigning hard or soft/collinear labels to the particles in the states. Consider, for example, the process Z →qqg. In what circumstances should one consider the gluon momentum to be collinear to the quark or antiquark momenta, or soft?
On the one hand, if one declares the gluon momentum to be soft or collinear, then there are necessarily interactions in H as that can produce the gluon through a real emission. Due to factorization, the amplitude for this emission from H as will approach that from H, but with opposite sign. So the two will cancel in the exact soft/collinear limits. In other words, if the gluon momentum is soft/collinear, then the hard matrix element ⟨qqg S H Z⟩ will vanish in soft/collinear limits. In this case, there is also a contribution to aqqg final state from the hardqq production ⟨qq S H Z⟩ and then an emission of g though the asymptotic interactions. This additional contribution is not power suppressed and adds to the ⟨qqg S H Z⟩ amplitude to produce the full distribution, in agreement with ⟨qqg S Z⟩. Such a deconstruction corresponds to the picture of matching onto a 2-jet operator C 2 O 2 and then matching on to a 3-jet operator C 3 O 3 in SCET. [62,63]. In such matching, the Wilson coefficient C 3 vanishes in soft and collinear limits.
On the other hand, it does not really make sense to compute ⟨qqg S H Z⟩ when the gluon is soft or collinear. The hard S-matrix is meant to give amplitudes for production of hard particles. The evolution of those hard particles into jets with soft/collinear substructure is subsequently determined by H as . Thus, a more sensible convention is to consider only matrix elements ⟨qqg S H Z⟩ when all 3 final state particles are considered hard. In this case, these particles have no interactions with each other in H as and there are no contributions to ⟨qqg S H Z⟩ that have real emissions from the asymptotic region. Thus, all the contributions to S H involving the asymptotic region are virtual (and give scaleless integrals in pure dimensional regularization). In other words, if one is interested in 3-jet production, one should study ⟨qqg S H Z⟩ and if one is interested in 2-jet production, one should study ⟨qq S H Z⟩.
Although the final predictions for IR-safe differential cross sections are independent of what convention we take for assigning labels to the final state particles (and always agree with the result from S), the hard S-matrix should always be thought of as giving the amplitudes for producing hard particles. With this convention ⟨qqg S H Z⟩ no longer vanishes in soft or collinear limits. Instead in these limits, it factorizes into ⟨qqg e −iHast+ qq⟩⟨qq S H Z⟩. Since the splitting amplitudes ⟨qqg e −iHast+ qq⟩ are universal [22,23,44], this restricts the possible form that ⟨qqg S H Z⟩ could have. Implications of these restrictions have been discussed extensively (see [64,65]) and are one instance of the deep structure present in S H -matrix elements.
In summary, one has two choices • Allow states in which S H matrix elements are taken to have soft or collinear momenta.
Observables computed this way will only be valid to leading power, but can be computed efficiently exploiting factorization.
• Insist that all states in which S H matrix elements are taken have only hard momenta. Then all the contributions from the asymptotic regions are virtual, and scaleless in dimensional regularization. Observables agree exactly with their computation using S.
We emphasize that with either choice, S H matrix are IR finite. The general observations in this section are backed up with explicit calculations in Section 5.2.

Soft Wilson lines
To connect our framework to previous work, we consider the QED case with massive electrons. In this case, there are only soft interactions in the asymptotic Hamiltonian. The interaction in the SCET Hamiltonian between soft photons and collinear fermions has the form (see Eq. (13)) where n µ is a lightlike 4-vector labeling the fermion,n µ is the direction backwards to n µ , and x − =n ⋅ x. For simplicity, we take n µ = (1, 0, 0, 1) son µ = (1, 0, 0, −1) and x − = t + z. The dependence of the interaction only on x − follows from the multipole expansion 2 . The collinear fields have only half the degrees of freedom of fields in QED: they only describe electrons in this case, as pair-creation is power-suppressed. So we can write The field expansion for the soft photon is as usual, but the phase is power expanded Inserting these field expansions and integrating over d 3 x gives Since k + ≪ p z after doing the q integral, we can replace a † q ≅ a † p at leading power and write Power expanding the energy ω q gives and hence the argument of the exponential becomes Then we find that the asymptotic Møller operator acting on a single electron state gives with P a path-ordered product. The path ordering is actually superfluous in QED, but is important in the non-Abelian case. The soft Wilson line in QED is defined as where the factor e −εs ensures convergence near s = ∞. Then, action of the the asymptotic soft Møller operator is the same as that of a product of soft Wilson lines For antiparticles, one would have Y n factors instead, and for incoming particles, one would have factors of Y n , defined as Y † n but with an integral from −∞ to 0 [22]. We can combine the time-ordered product of exponential into a single exponential using the Magnus expansion [66], where the higher order terms are sums of nested commutators. The commutators of two fields in Feynman gauge can be computed directly from the field expansions in Eq. (28) Since the commutator in Eq. (37) is a c-number, additional commutators vanish. This is the essence of Abelian exponentiation. Then, we can combine all the time-ordered exponentials into a single exponential: where When acting on states with electrons, this combination is exactly of the form e R e iΦ that Faddeev and Kulish write (see Eq. (6) To see the connection to the Coulomb phase, let us do the integrations over s and t in Eq. (39) using Eq. (37) Taking n 1 = (1, 0, 0, 1) and n 2 = (1, 0, 0, −1) we can simplify this to This is the usual divergent integral appearing in the Coulomb phase (cf. Eq. (108)). When one of the electrons is incoming, the ∫ which is consistent with the Coulomb phase vanishing for timelike kinematics.
In this way, we have shown that our framework agrees with previous work in the case of QED, where there are soft but not collinear singularities and the gauge boson is Abelian. Note that both the Coulomb phase and the real part of the exponent emerge from the single soft-collinear interaction in H as .
In the non-Abelian case, one cannot combine the path-ordered exponentials into the exponential of a single closed-form expression as in Eq. (38): the gauge generators do not commute. There is an analog of Abelian exponentiation, called non-Abelian exponentiation [37,38,54] but one must include higher order commutators, and no closed form expression is known. Thus, a Faddeev-Kulish type formulation of the dressed states is impossible for QCD. The Wilson-line description of the soft interactions is still valid, however, and the soft interactions in QCD still factorize off of the scattering operator into soft Wilson lines.

Computing the hard S-matrix
In this section, we show how to compute S H -matrix elements perturbatively. We will use the formula in Eq. (17): We call the two outer matrix elements the asymptotic region and the part involving ⟨ψ ′ out S ψ ′ in ⟩ the central region. The asymptotic regions go from 0 > t > −∞ and ∞ > t > 0, both backward in time. The central region calculation is just that of an ordinary S-matrix. A cartoon of the division is shown in Fig 3. In this section we establish the Feynman rules for the asymptotic regions, which are similar to those in old-fashioned, time-ordered perturbation theory with a few changes. We also give an example calculation in φ 3 theory that clarifies some of the subtleties. Calculations for physical process in QED, QCD and N = 4 SYM theories are given in subsequent sections.

Asymptotic region Feynman rules
We have reduced the problem of computing matrix elements of S H to calculating matrix elements of S and matrix elements of the form in perturbation theory. To evaluate these matrix elements, we separate the asymptotic Hamiltonian into a free part and an interaction part Defining the operator U as + (t) by the equation Ω as + = lim t→∞ U as + (t), it satisfies the differential equation −i∂ t U as asymptotic region Figure 3: In order to facilitate calculations in perturbation theory, we divide the matrix elements of S H into three parts. In the two outer parts, the asymptotic evolution Møller operators Ω as ± work to dress the in-and out-states. The middle part corresponds to a calculation of traditional S-matrix elements between dressed states.
where the superscript I indicates that V I as is the interaction picture potential, i.e. the asymptotic potential V as [φ 0 ] = − ∫ d 3 xL as [φ 0 ] expressed in terms of freely-evolving interaction picture fields φ 0 , and where L as is the Lagrangian density corresponding to the asymptotic interactions. This differential equation has the solution U as where T denotes an anti time-ordered product.
To see how to evaluate matrix elements of this operator, consider the following diagram in scalar φ 3 theory: The free fields are given by One-particle states in the free theory are: Up to renormalization, which will be discussed later, the external states are as usual taken to be free creation operators acting on the free vacuum. We therefore aim to calculate The second order term in g is: (53) Inserting Eq. (50) and commuting creation and annihilation operators, gives the following expression corresponding to the diagram above:

0⟩
(54) Integrating over ⃗ x and ⃗ y gives δ-function. Integrating over these δ-functions and the additional δ-functions coming from the creation and annihilation operators reduces the expression to (55) Finally, the integrals over t x and t y give More generally, the Feynman rules for the asymptotic regions are the same as those in ordinary relativistic time-ordered perturbation theory (see [67] for example) with two differences: 1) Since the outermost integral goes from 0 to ∞ instead of −∞ to ∞, the overall energy-conserving δ-function; and 2πδ(E f − E i ) is replaced by a propagator i E f −E i +iε 2) the evolution is backwards in time (e iHast instead of e −iHast ) so the whole amplitude is complex conjugated. This means ig → −ig and i E+iε → −i E−iε .
For explicit computations and consistency checks, one has to be very careful about the iε prescription. It is important to keep in mind that the propagators −i E−iε are distributions, only defined under integration. The iε comes from an integral representation of the θ function: so it really should be associated with the shift ω → ω − iε for any integral ending at t = +∞ or ω → ω + iε for any integral starting at t = −∞. When we have a nested integral, like Eq. (55), we get So each vertex gives another factor of ε. An example of the importance of careful treatment of these distributions is given in Section 3.2.
In summary, the Feynman rules for ⟨ψ out Ω as + ψ ′ out ⟩ are as follows • Draw all relevant time-ordered diagrams between the state ψ out ⟩ at t = 0 on the right and ψ ′ out ⟩ at t = ∞ on the left: • Assign momenta k µ i to each internal line, with k 0 i = ω k = m 2 + ⃗ k 2 i the on-shell energy. • Start at the far left of the diagram (t = ∞), and move a vertical cut rightwards time until a vertex is crossed. After each vertex is crossed, include a factor of where E cut = ∑ ω cut is the total energy of the particles in the cut, E ′ out = ∑ ω ′ out is the total energy of the particles in ψ ′ out ⟩, and n is the number of vertices that have already been crossed in the asymptotic region. Note that the −iε comes from a +iε from the t = +∞ region, and is then complex conjugated.
• For each vertex, add a factor of (2π) 3 δ 3 (∑ ⃗ p i ) to impose 3-momentum conservation and −ig for the interaction (or whatever the interaction is, just as in regular Feynman rules, complex-conjugated).
• Integrate over ∏ i ∫ d 3 k i (2π) 3 2ω i for the momentum of each internal line.

The Feynman rules for ⟨ψ ′
in Ω as − ψ in ⟩ are identical except that the diagrams go from t = −∞ on the right to t = 0 on the left where E ′ in = ∑ ω ′ in is the total energy of the particles in ψ ′ in ⟩.

Cross check in φ 3 theory
To validate the Feynman rules, consider the case where H as = H. In this case, the hard S-matrix is trivial S H = 1. Perturbatively, this means that diagrams with all vertices in the central region should be exactly canceled by diagrams involving vertices in the asymptotic regions. Moreover, the cancellation should occur for each time-ordered diagram on its own. We can check this cancellation in any theory and any diagram, so we take φ 3 theory with Lagrangian L = − 1 2 φ ◻ φ + g 3! φ 3 for simplicity and consider the diagram We sum over diagrams with t 1 and t 2 going from 0 to −∞ to ∞ and back to 0. Let us call the initial energy as ω i = ω p , the final energy ω f = ω p and the energy of the intermediate state ω c = ω p−k + ω k .
The usual time-ordered perturbation theory loop (i.e. the contribution from S to S H with all vertices in the central region) is To see this cancel other diagrams, it is helpful to break this diagram down further, into the contribution into 3 regions: first, −∞ < t 1 < t 2 < 0 then −∞ < t 1 < 0 < t 2 < ∞ and finally 0 < t 1 < t 2 < ∞: In this decomposition, we have employed the careful treatment of the distributions discussed around Eq. (58). Contributions from the loop in the asymptotic region are given by (66) and contributions from the loop divided between the two asymptotic regions is Lastly, there are contributions from diagrams with one vertex in the asymptotic region: Adding these contributions up, we find Similarly, all the contributions to the other time ordering of the diagram in Eq. (63) sum up to zero. Note that for the cancellation to occur, it was important to keep track of the distributional nature of the diagrams as encoded in the the factors of ε.

QED: Deep Inelastic Scattering
As a first real application, we consider the e − γ ⋆ → e − in QED with a massless electron. We call this deep inelastic scattering (DIS) in reference to the analogous process in QCD at the parton level, although obviously there is nothing inelastic about this scattering. We want to establish two facts about this process: that the hard S-matrix is IR-finite and what its value is. To compute the value for S H it is most sensible to use dimensional regularization. In dim reg, all the diagrams with interactions in the asymptotic region give scaleless integrals that formally vanish, so the bare S H -matrix element is determined by the S-matrix element alone. However, in pure dimensional regularization, it is difficult to separate UV from IR singularities. Therefore to check the cancellation of IR divergences, we use explicit cutoffs in the asymptotic regions.

S H using cutoffs on H as
In this section, we look at the diagram where a photon is exchanged between the two electron legs. The Feynman diagram in Feynman-'t Hooft gauge is given by [68] withμ 2 = 4πe −γ E µ 2 and M 0 = −eu f γ α u i the tree-level matrix element. To get a cancellation of the IR divergent terms, we need to add contributions to S H from graphs with vertices in the asymptotic regions. We would like to avoid the possible double counting of the soft and collinear degrees of freedom in H as . Working in pure dimensional regularization, the softcollinear overlap always gives scaleless integrals that vanish. Indeed, the method-of-regions approach is to simply discount the overlap region all together. If one works with regulators that separate the UV from IR, one can explicitly remove the overlap through a zero-bin subtraction procedure [69]. In SCET, this is done by computing the soft contribution and the collinear contribution then subtracting the soft-collinear overlap through a soft-collinear power expansion at the diagram level. If one formulates SCET in terms of operators with full theory fields, as in [23], the zero-bin subtraction appears as an operator-level subtraction.
In this section, we take the pragmatic approach of [23]: we exclude by hand the softcollinear region in H as . So we compute soft contributions from H as by power expanding in the soft limit than integrating photon momenta up to some ω max . We compute the collinear contributions by power expanding in the collinear limit and including only those photons with energy greater than ω max that are within θ max of one of the collinear directions. Similar calculations showing IR divergence cancellations for thrust and jet broadening can be found in [70].
To check IR divergence cancellations, we only need to look at a subset of time-ordered perturbation theory diagrams. For example, the diagrams or (71) are not IR divergent. Although these diagrams give finite contributions to S H , they do not need to be analyzed for the purposes of demonstrating IR finiteness. It is natural to work in the Breit or "brick-wall" frame, where the off-shell photon has no energy, q µ = (0, 0, 0, Q) and p i and p f are back to back. Defining θ as the angle between ⃗ k and Q, we have and If we were to impose overall energy conservation, then we would also have ω i = ω f = Q 2 . However, in time-ordered perturbation theory graphs involving vertices in the asymptotic regions, energy conservation is not guaranteed, so for those diagrams we leave ω i and ω f more general until energy conservation can be established. With these kinematics the phase space integral becomes where The graph with all the vertices in the central region is This graph is UV and IR divergent. But since this is the only IR-divergent time-ordering, we know its result must reproduce the IR divergences of the sum over all time orderings, i.e. the Feynman diagram in the full theory. So we can then read the IR divergences directly off of Eq. (70): The contribution with both interactions in an asymptotic region is given by soft photon exchange alone; there are no collinear photons that couple to both the incoming and outgoing electrons since these are back-to-back. Thus, we need to power expand the integrand in Eq. (70) at small ω k and restrict to ω k < ω max . Before power expanding, the time-ordered perturbation theory amplitude has the form Note that the overall energy-conserving δ-function δ(ω i − ω f ) from Eq. (70) is replaced with δ(ω i−k − ω f −k ). in Eq. (77), however at leading power the two δ-functions agree. In the soft limit, the energies of the intermediate electrons are and the numerators are expanded as Inserting the power expansion, the amplitude reduces to where x = cos θ. Performing the integrals gives The remaining two graphs are These have one vertex in the asymptotic region and one in the central region. In the first graph, the asymptotic vertex forces the exchanged photon to either be soft or collinear to the direction of the outgoing electron. In the second graph, the photon can be soft or collinear to the incoming electron. We must therefore power expand each in soft and collinear limits separately. Before doing any expansion the first graph is In the soft limit, this reduces to the same integral as in S B up to a sign flip since only one vertex is anti-time ordered. S D is similar. So we get These will cancel the double poles of S A + S B .
The graph S C has a collinear singularity when θ → 0. For the collinear graphs, as mentioned above, we consider collinear photons to be collinear but not soft, so they have energies ω k > ω max and angles 0 < θ < θ max . In the collinear limit, k ∥ p i , the energies expand to Since these expansions are only valid in the regime where the electron does not recoil against the photon, i.e. for ω k < ω i , we put ω i as an upper cutoff on the photon energy. The spinors in the numerator are on-shell, so in the collinear limit the numerator can be approximated using Then S coll C reduces to Note that this graph has a single 1 pole corresponding to the collinear-but-not-soft region. The amplitude S coll D is the same as S coll C . In summary, extracting just the IR poles with M 0 = −eu f γ α u i the tree-level matrix element. Summing these graphs, the IR divergences all cancel. Note that this is a different mechanism from the way the cancellation happens in a matching calculation for the DIS Wilson coefficient in SCET [68]. There, the soft graph is subtracted from the full theory graph (S A − S B ) to achieve the cancellation. Here those graphs are added, and additional graphs come in to effect the cancellation.

S H in dimensional regularization
Imposing cutoffs on the asymptotic Hamiltonian is useful for showing the cancellation of IR divergences. In practice, however, the calculations are much simpler using pure dimensional regularization. Dimensional regularization respects both Lorentz and gauge invariance, while explicit cutoffs do not. Moreover all 1PI graphs involving vertices in the asymptotic region are scaleless and formally vanish. This follows from simple power counting arguments: in the soft limit, we take all hard scales to infinity so there are no scales left for the amplitude to depend on. In collinear limits, only lightlike momenta in one direction are relevant and no Lorentz-invariant scale can be constructed from collinear lightlike momenta.
For an explicit example, consider the soft graph S B , from Eq. (81) The integral over ω k is scaleless and formally vanishes in dimensional regularization. Note that there is also a IR divergence in this case in the angular, x, integral, so the final result has an overlapping UV/IR 1 UV 1 IR singularity. Such singularities never occur in renormalizable theory, but they do occur in SCET. However, since when one adds up all the diagrams we know that the IR divergences cancel, the overlapping UV/IR divergences must cancel as well. These cancellations have been studied extensively in SCET (see the reviews [56,57]).
Thus the only non-vanishing graphs in pure dimensional regularization are those with all vertices in the central region. In the central region, hard interactions are present, and these are associated with particular scales. In d = 4 − 2 dimensions, in Feynman gauge, the result for the loop is given in Eq. (70). For this diagram, the UV and IR divergences can be unambiguously separated since the UV divergences are known separately to be cancelled by the ordinary QED counterterms. For S H diagrams, such a separation is also possible, but much more difficult, since there can be overlapping UV and IR singularities (see [69][70][71] for some discussion).
In any case, since the other diagrams contributing to S H are scaleless and since S H is IR finite, we can immediately write down the the bare S H amplitude using Eq. (70). Writing, we then have The renormalized S H -matrix element is related to the bare one by operator renormalization. To remove the UV divergences, we can rescale the S-matrix by So that the renormalized matrix element in MS is then which is UV and IR finite. It may seem surprising that Z can depend on the scale Q: normally Z-factors are just numbers. In fact, the Q dependence is just shorthand for a more formal dependence of the S H -matrix elements on the labels of the collinear fields. In the label formalism, the S-matrix for e − (p 1 )γ ⋆ (q) → e − (p 2 ) can depend on its labels, which are the large components of the momenta of the collinear particles, p − 1 =n 1 ⋅p 1 ∼ Q and p + 2 =n 2 ⋅p 2 ∼ Q. These labels are non-dynamical, and so the Z-factor can depend on them. Thus, one could more pedantically write and S H,bare . But writing the dependence as on Q or more generally s ij = (p i +p j ) 2 is simpler.
It is perhaps worth commenting on why S H needs to be renormalized in the first place. The traditional S-matrix is also an operator, however it does not normally get an operator renormalization: its UV divergences are cancelled by rescaling the interaction strengths in the Lagrangian and the fields. The reason S H needs to be renormalized is due to diagrams that have both interactions in the asymptotic regions and hard momentum flowing through the graph due to interactions in the central region. The soft particles in H as cannot resolve the hard scales and there are no interactions in H as which could be renormalized to remove the associated UV divergences. While S-matrix elements are smooth, differentiable functions of momenta, the smoothness is lost in the soft power expansion generating S H . Thus hard scattering, from the point of S H looks instantaneous and non-local, like a sharp, non-differentiable cusp at the hard vertex. In other words, the additional renormalization required in S H is the same as the need for renormalization associated with cusps in Wilson line matrix elements. The non-locality of SCET (on hard length scales) and cusp renormalization is discussed more in [56,57].

QCD: e + e − → jets
To illustrate the use of S H to compute infrared-safe observables, we will explore as an example, the computation of thrust in e + e − events to NLO in QCD.
The hard matrix element for γ ⋆ →qq is the same as for DIS, up to a crossing. Explicitly, Due to the ln(−Q 2 − iε) term, this S H -matrix element is complex. The imaginary part is the leading order expansion of the Coulomb/Glauber phase, and is present in processes with more than one charged particle in the initial or final state.

Glauber graph
It is perhaps illuminating to see the origin of the imaginary part from the relevant asymptoticregion graphs. Part of the reason this question is interesting in our framework is because Glauber gluons are normally associated with purely off-shell modes, with entirely transverse momentum. In time-ordered perturbation theory one has only on-shell modes. So how is the Glauber contribution going to be reproduced? Before power expanding in the soft region, the relevant time-ordered diagram is (up to some prefactors): If we were to enforce 3-momentum and energy-conservation in the central region, this would force ω 1+k = ω p 2 +k = ω 1 = ω 2 = Q 2 . Then ⃗ k must have exactly zero energy, as expected for an off-shell mode, and the integrand appears ill-defined. The problem however is not that k is off-shell, but that we have not been sufficiently careful handling the product of distributions.
To properly evaluate the integral, we must be patient in enforcing the energy conservation in the central region. Recall that energy conservation comes from integrating over −∞ < t < ∞. If we break the central region up into a −∞ to 0 region and a 0 to ∞ region, then the hard vertex can be in only one of the regions. Let us also pretend for now that H as is the same as H with the exception of the hard vertex. Then, if the hard vertex is at t < 0, the evolution from e −iHt from 0 to ∞ will be exactly be cancelled by the evolution from t = ∞ to 0 in the asymptotic region. That is In equations, the cancellation occurs point-by-point in phase space as where In the real case, where H as is not exactly the same as H without the hard vertex, these graphs will not sum to precisely zero, but to something that is IR finite. The cancellation of the graphs with the hard vertex at t < 0 implies that the nonzero contribution of the graph in Eq. (101) comes from the region where the hard vertex is at t > 0. So we must look at Now we only have 3-momentum conservation, not energy conservation. So, ⃗ p 1 + ⃗ p 2 = 0 and thus ω 1 = ω 2 , but nothing forces ω 1 = Q 2 . Defining the angle between ⃗ k and ⃗ p 1 as θ, in the soft limit ω 1+k ≅ ω 1 + ω k cos θ and ω 2−k ≅ ω 2 + ω k cos θ, so performing the power expansion results in The ω k integral is scaleless, being both UV and IR divergent. The iπ in this expression corresponds to the imaginary part in Eq. (100), and is known to exponentiate into the Coulomb/Glauber phase. The third graph in Eq. (102) is similar, leading to the same result as in Eq. (106) with i ω 1 +ω 2 −Q+iε replaced by i Q−ω 1 −ω 2 +iε . The two graphs combine to produce the expected δ(ω 1 + ω 2 − Q) factor.
So we see that the Glauber phase is indeed reproduced by asymptotic diagrams with onshell modes. Moreover, energy is conserved in this process. The key was to carefully handle the imaginary parts of the propagators and δ distributions. There are of course many other ways to compute this imaginary part (cf. [59]), but this approach clarifies the importance of carefully treating energy conservation in S H computations.
In more complicated processes, such asqq →qq in QCD at 2 loops, it is known that the Glauber contribution from the full graph (the central region) is not reproduced by the eikonal approximation [72]. Consequences of this failure include collinear-factorization violation [73] and the emergence of super-leading logarithms [74]. For S H this means that the IR divergences of the central region will not be canceled by an asymptotic Hamiltonian with soft and collinear gluons alone. Fortunately, it has been shown that one can add to the SCET Lagrangian a set of Glauber operators [58] and remedy the failure of the soft limit. A detailed discussion of when these operators are relevant and how they resolve issues such as collinear-factorization violation can be found in [75]. The Glauber interactions, like soft interactions, are long distance and persist will after the hard scattering. Although they violate factorization, in the sense that they are long-distance interactions that depend on multiple directions, they are still independent of the hard scattering.
To connect the Glauber graph M G to the Glauber operator, we can massage the imaginary part of the integral in Eq. (105) into a more familiar form, using the equation. We first drop the i in the denominator 1 cos θ−1−i , since the endpoint singularity at cos θ = 1 is regulated for < 0 by the (1 − cos 2 θ) − factor in the measure (see Eq. (74)). Rewriting the integral in terms of k z = ω k cos θ and ⃗ k ⊥ gives To take the imaginary part we now use Im 1 kz−2i = πδ(k z ) and integrate over k z to get integrand is exactly what comes out of theξ n 1 n 2 2 ξ n 1 1 P 2 ⊥ξ n 2 n 1 2 ξ n 2 Glauber operators [58,59,76]. In other words, tree-level exchange in the asymptotic region corresponds to the Glauber region expansion, except it has an opposite sign. Note that since k z = 0 the on-shell energy of the Glauber gluon is is not coming from an off-shell mode but rather from energy not being conserved in time-ordered perturbation theory. Alternative ways of understanding the Glauber phase can be found in [52,53,77,78].
For completeness, we list the IR divergent parts of the various contributions to S H for this process cutting off the UV divergence of the soft integrals at ω max , as in Section 4.1.
withμ 2 = 4πe −γ E µ 2 and M 0 = u 1 γ α v 1 the tree level matrix element. Summing these graphs, the IR divergences all cancel. Note that while the imaginary part of the Glauber graph, Eq. (110), cancels against the S-matrix graph, Eq. (109), the real part of the Glauber graph has the same sign as the S-matrix graph, and the sum of the two cancels against the cut graphs. This is different from how the cancellation occurs in matching to a 2-jet operator in SCET, where a single soft graph cancels both the real and imaginary parts of the divergences of the full-theory graph.

Thrust
Next, let us use the hard S-matrix to compute the thrust observable [79]. Thrust is a particularly simple infrared-safe e + e − observable. It is defined as It is convenient to use τ = 1 − T rather than T . Thrust has the property that for events that consist of two highly collimated jets τ ≪ 1. At small τ , thrust is approximated by the sum of the masses of these two jets τ ≅ 1 Q 2 (m 2 J1 + m 2 J2 ), with Q the center of mass energy. Events that are more spherical have values of τ ∼ 0.2 − 0.5.
To compute dσ dτ in perturbation theory using S H , we start at lowest order, where the hard S-matrix element is At next to leading order we need the hard matrix element forqq final states at NLO, as given in Eq. (100) We also need the matrix elements for γ ⋆ →qqg where we treat the gluon as hard. Treating it as hard, the only contribution at order g s is from diagrams with all vertices in the central region. Then this amplitude is identical to the S-matrix element for the same process To compute the observable, we must then evolve these final state to t = +∞ using H as , as discussed in Section 2.2. On the formal level, this additional evolution exactly cancels the entire effect of H as , so the cross section predicted is identical to that using the original S. On the practical level, however, one can gain additional insight into the distribution by actually using the S H -matrix elements we have computed, rather than simply discarding them and starting over. To this end, it is helpful to contemplate the small τ and moderate τ regions separately. For small τ , the gluon is necessarily soft or collinear. Thus we can disregard the hard M(γ ⋆ →qqg) contribution. Instead, we should start with M 0 (γ ⋆ →qq) and then evolve thē qq final state towards a 2-jet state with nonzero τ using H as . To compute the cross section, we need to sum the cut graphs In these graphs each dotted green line represents a separate contribution where the measurement function at t = ∞ is inserted. The first two graphs have only soft contributions and the second two soft and collinear contributions (although the soft ones vanish in Feynman gauge). The middle cuts in these graphs using soft interactions in the asymptotic regions corresponds to soft real-emission processes. The amplitude for soft emission using H as is the eikonal limit of Eq. (116), with an opposite sign and without the hard matrix element, Then the contribution to the differential thrust cross section at order α s from these four cuts is (119) In this expression, the θ-functions project onto the appropriate hemisphere defined by the thrust axis (which aligns with theq − q axis at leading power). The first and third cuts in all the graphs, using soft interactions, give the virtual contributions. Summing all of them, the result is the same as the contribution to thrust from the thrust soft function [75,80]: Although the real and virtual contributions are separately infrared divergent, the final contribution to the cross section is not. Similarly, the contribution from collinear graphs gives the jet functions. The net contribution is Multiplying these by the S H -matrix element squared, the sum is This agrees with the exact NLO thrust distribution at leading power (see [81]). Note that the µ dependence of S H -matrix elements exactly cancels against the µ dependence of the soft and collinear contributions in the asymptotic region. For values of τ that are not small, one should necessarily treat the gluon as hard. The measurement function in this region is therefore only sensitive to hard particles. Since there are no asymptotic interactions between hard gluons and hard quarks, the S H -matrix element in this regime is the same as in Eq. (116). Integrating the square of this matrix element against the thrust measurement function gives for τ > 0, Near τ = 0 this contribution coming from M(γ ⋆ →qqg) is singular, and the phase space integral is IR divergent. However, at τ = 0 there is also the contribution from M(γ ⋆ →qq).
Although we can define the measurement function so that it is not sensitive to any gluon that couples in H as , we cannot remove the soft and collinear gluons form H as . These gluons still contribute to the cross section through loops, and affect the thrust distribution at τ = 0. The virtual graphs are the first and third cuts in all the diagrams in Eq. (117). These graphs are IR divergent. If we work in 4 − 2ε dimensions, the full phase space integral over the 3-jet contribution M(γ ⋆ →qqg) 2 generates 1 2 IR and 1 IR poles that exactly cancel the 1 2 IR and 1 IR from the virtual graphs. The result is that which is the exact NLO thrust distribution in QCD.
So we see that S H is capable of both reproducing distributions in fixed order QCD and, through the asymptotic expansion, reproducing just the leading-power parts of those distributions. An advantage of leading-power approach is that one is not forced to compute the S H -matrix elements and the asymptotic evolution to the same order in α s . Instead, one can use exponentiation properties of the soft and collinear emission to evaluate the asymptotic evolution to all orders in perturbation theory. In particular, one can perform resummation with the renormalization group, since the soft and collinear contributions each are associated with only a single scale. Doing so in this example reproduces the resummed thrust distribution computed using SCET [75,80].

N = 4 Super Yang-Mills
To further illustrate the features of S H , we now consider amplitudes in N = 4 Super Yang-Mills (SYM) theory. N = 4 SYM is a superconformal SU (N c ) gauge theory in which scattering amplitudes have been studied quite extensively. To leading order in 1 Nc , the only Feynman diagrams that contribute have planar topology and each loop order gives an additional factor of the 't Hooft coupling λ = g 2 s N c . Since only one color structure is relevant at large N c , is is convenient to factor out the group theory factors. In addition, the amplitude for n-gluon scattering is totally symmetric in the permutation of the external legs. With these observations, it is conventional to write the L-loop amplitude with n external legs as where sum is over non-cyclic permutations ρ of the external legs. The arguments ρ(1), etc., refer to the permutation of the momenta and helicities of the legs. It is furthermore convenient to scale out the kinematic dependence of the tree-level amplitude by defining In addition, we will find it useful to discuss the terms of each order in separately, so we write M (L) and decompose other quantities analogously.
In general, the bare n-leg L-loop amplitude is an extremely complicated function of the external momenta, even for planar maximal-helicity violating (MHV) amplitudes. What is interesting though is that there seems to be structure in the L-loop amplitude after the 1-loop amplitude is subtracted. More precisely, the ABDK/BDS ansatz proposes that the L loop amplitude should be expressible in terms of the 1-loop amplitude and some transcendental constants [82,83]. More precisely, the full matrix element with n legs has the form where f (L) ( ) is independent of n and related to the cusp anomalous dimension (explicitly f (1) ( ) = 1 and f (2) ( ) = −ζ 2 − ζ 3 − ζ 4 2 + ⋯. The numbers C (L) are also independent of n and represent the part of the L-loop amplitude not given by the exponentiation of the first term. By explicit computation it is known that C (1) = 0 and C (2) = − 1 2 ζ 2 2 . Finally, E (L) n ( ) has only positive powers of , so that E (L) n (0) = 0. It turns out the BDS ansatz was not quite correct: there is more structure to the amplitudes than just the numbers C (L) for n > 4. Thus, it is common to express amplitudes as ratios of the bare amplitudes and the BDS ansatz. More precisely, the remainder function is defined as and one can expand R n order-by-order in g s . While the remainder functions have some nice properties, such as respecting dual conformal invariance, they violate other conditions, such as the Steinmann relations [84]. To preserve the Steinmann relations, the BDS ansatz is modified to the "BDS-like" ansatz [85]. For certain amplitudes (n = 8 for example), it has been shown that both the Steinmann relations and dual conformal invariance cannot be satisfied simultaneously [86]. That the BDS ansatz violates the Steinmann relations is due to the additional subtraction of finite, O( 0 ), terms in Eq. (128) in addition to the IR divergences. A more conservative ansatz is the "minimally-normalized" amplitude M min n defined as [87] M min where the IR divergences of M The ratio Mn In this section we relate some of these observations to the hard S-matrix element. We will see that the hard S-matrix element computed in MS corresponds closely to the minimally normalized amplitude.

4-point amplitude
We begin by discussing the MHV amplitude with 4 external legs. The IR divergences of the 1-loop amplitude for n = 4 are known to agree with the divergences of and the divergences of the 2-loop amplitude agree with the divergences of These formulas are due to Catani [88] (see also [89]). Note that C   4 ( ), its appearance in the universal formula can be understood from the point of view of effective field theory [90]: it comes from a cross term between the non-universal 1-loop Wilson coefficient and the universal 1-loop divergences. An equivalent mechanism explains its appearance during the computation of S H , as we now show.
With 4 legs (n = 4), the 1-loop amplitude is where M (1) loops. This amplitude is IR finite, but UV divergent. Denoting M (L) n ( ) Â (0) n ( ) in analogy to Eq. (126), the 1-loop bare hard matrix element is the same as Eq. (134) with IR replaced by UV . The renormalized matrix element is then where, with minimal subtraction (MS) Note that M

. The result is that
This matrix element is significantly simpler than M 4 ( 0 ) in Eq. (141), and does not require any ad-hoc subtractions.

Scheme choice
Dimensional regularization and minimal subtraction is the most widespread scheme in use in SCET. We must keep in mind, however, that due to renormalization there is scheme dependence in S H . This is not a problem per se, since S H itself is not directly observable. One expects that one S H -matrix elements are combined into observables the scheme dependence will cancel. Indeed, the cancellations that occur will be similar to the cancellations that occur in SCET. For example, Ref. [91] showed that physical observables agree when conventional dimensional regularization, four-dimensional helicity scheme, or dimensional reduction are used, despite the fact that the hard, jet and soft functions are different in the different schemes. In a normal, local field theory, the counterterms are strongly constrained: they must just be numbers. In SCET the counterterms can depend on the labels for the various collinear directions which translates to dependence of hard kinematical quantities, like s and t, as in Eq. (144). However, one cannot choose an arbitrary function of labels, as the dependence must be canceled by contributions from soft and jet functions. Roughly speaking the combination, H ⊗J ⊗S must be scheme independent, where the hard function H corresponds to the square of our hard S-matrix elements. More discussion of these constraints can be found in [91].

Summary and Outlook
The traditional S-matrix is only well-defined if time-evolution of a theory is well-approximated by free evolution at early or late times. Indeed, the free Hamiltonian H 0 is part of the definition of S used for perturbative calculations. When a theory has massless particles, the interactions do not die off fast enough at asymptotic times, resulting in a poorly defined, divergent S-matrix. We argue that a sensible, finite S-matrix is obtained by replacing H 0 in its definition with an asymptotic Hamiltonian H as that correctly accounts for all the asymptotic interactions. Our key principle for choosing H as is that the states should evolve before and after they scatter independently of how they scatter. That such an H as exists and makes the S-matrix finite is guaranteed by theorems of hard-collinear-soft factorization. Capitalizing on these theorems, we define H as as the leading power expansion of the full Hamiltonian in soft and collinear limits, and call the corresponding S-matrix the hard S-matrix, S H . S H is finite order-by-order in perturbation theory, as we have verified through a number of explicit examples in QED, QCD and N = 4 super-Yang-Mills theory.
While the traditional S-matrix is IR divergent, it can still be used to compute IR-finite observables. This is done by summing over a broad enough set of processes so that the sum is finite even though individual contributions are divergent. With S H , the same physical predictions result using the matrix elements of a scattering operator that are finite processby-process.
We presented a method and Feynman rules for the perturbative calculation of S H -matrix elements. The method involves separating S H into three parts: An asymptotic part evolving the state from t = 0 to t = −∞, the evolution from t = −∞ to t = ∞ and an asymptotic part evolving from t = ∞ to t = 0. Each asymptotic part is calculated using Feynman rules similar to those in time-ordered perturbation theory but without overall energy conservation, and the middle part consists of conventional Feynman diagrams. The three part picture is presented for calculational convenience, since it breaks up calculations into essentially usual time-ordered perturbation theory and Feynman diagrams, and bypasses the need to derive a new interaction picture with modified propagators.
The hard S-matrix has has numerous advantages over the traditional S-matrix. The first advantage is the obvious one: S H exists. Second, matrix elements of S H have a rich structure with diverse interpretations. One can interpret the asymptotic evolution as dressing the states, so that a initial Fock state with a finite number of particles evolves into a dressed state with an infinite number of particles at asymptotic times. This connects our construction to previous work on coherent states, such as by Chung [24] or Faddeev and Kulish [33]. Alternatively, S H -matrix elements can be interpreted as Wilson coefficients in Soft-Collinear Effective Theory. Finally, S H -matrix elements are closely related to finite remainder functions studied in the amplitude community. Indeed, much of the progress in understanding scattering amplitudes over the last few decades has comprised results about an object, the S-matrix, that formally does not exist. Since there is so much interest in the S-matrix itself (as opposed to cross sections), it is logical to try to put this object on a firmer theoretical footing. Doing so was one of the main motivations of this paper.
There are a number of new ideas contained in this paper. These include: • The first explicit calculation of a finite S-matrix in theories with massless particles. While other authors have introduced similar concepts in QED, there are no explicit calculations in the literature of actual matrix elements. The majority of papers focuses on just the IR divergence cancellation. Issues such as regulator dependence, renormalization, subtraction schemes, phase space integrals, computation of observables, completeness of the Hilbert space, etc., are all glossed over unless one is able to do explicit computations.
• We present a new rationale for choosing the asymptotic Hamiltonian. While others have argued that the asymptotic Hamiltonian should make the S-matrix IR finite, we argue that such a criterion is not restrictive enough: one could choose H as = H to satisfy that requirement. Instead we argue that one should use that the asymptotic evolution is independent of the hard scattering. That there exists an asymptotic Hamiltonian with this property in gauge theories is non-trivial and follows from factorization theorems.
• We connect the literature on coherent states to that of factorization and that of scattering amplitudes. In particular, the hard S-matrix elements can be identified as S-matrix elements of coherent states, as Wilson coefficients in SCET, and as finite remainder functions in N = 4 SYM fields corresponding to BDS-inspired subtraction schemes.
• We provide an explicit set of Feynman rules to evaluate S H elements in perturbation theory. These rules involve distributions and products of distributions that must be handled with some care.
• We provide a number of examples of S H -matrix element calculations, both using pure dimensional regularization and with explicit cutoffs on H as .
• We examine how the Glauber/Coulomb phase arises in asymptotic-region diagrams.
In particular, energy non-conservation in the asymptotic regions allows the Glauber contribution to be reproduced (and cancelled) without off-shell modes.
• We demonstrate that infrared-safe observables computed with S H will agree with those computed using the normal S-matrix, and, to leading power, with those computed using SCET or other factorization frameworks. We are not aware of any paper on dressed states that makes a physical prediction using them. In our framework, one can see how the dressing occurs, but also how the states get "undressed" in the final asymptotic evolution before the measurement is made.
• Although predictions using S H reduce (almost trivially) to predictions using S, matrix elements of S H can be studied as interesting objects on their own. These matrix elements are scheme and scale-dependent, but still have physical interpretations, just like the MS couplings α s (µ).
These last two bullets are perhaps worth some additional discussion. The incontrovertible truth is that cross sections computed with S, despite coming from IR-divergent amplitudes, are in perfect agreement with observations. Thus, no matter how one attempts to make scattering amplitudes finite, the framework must reproduce these cross sections exactly. In other words, it is foolhardy to try to make different predictions at the cross section level with a new S-matrix. That being said, there are situations, in particular those with charged initial states such as e + e − + photons → Z + photons, where it is not entirely clear what the physical cross section is supposed to be [14]. In such situations, a finite S H may provide some clarity.
Although we cannot expect S H to revolutionize the computation of physical cross sections, having a finite S-matrix is still enormously beneficial for the study of scattering amplitudes themselves. Indeed, the majority of research of scattering amplitudes focuses on S-matrix elements themselves, not on observables. So it is this community that might benefit first from S H . As an example, we showed that certain S H elements in a supersymmetric theory naturally satisfy the Steinmann relations, at least to two loops. In contrast, S-matrix elements are IR divergent and, depending on how the IR divergences are subtracted, the Steinmann relations may or may not be satisfied. More broadly, because S H corresponds to the matrix elements of a single unitary operator, rather than a ratio of such matrix elements, it should automatically satisfy any constraints that follow from unitarity. One might also imagine that properties stemming from analyticity would be more transparent in matrix elements of a single operator rather than a ratio.
Finally, let us briefly discuss how to think about S H non-perturbatively. In this paper, we have advocated for computing S H in dimensional regularization with MS subtraction. At each order in perturbation theory, one can compute S H elements this way. It may seem counterintuitive, but perturbation theory has historically been the best way to orient investigation into non-perturbative physics, and a perturbative approach could be similarly successful for S H . One can also resum S H using renormalization group techniques to examine its all-orders behavior. Alternatively, one could (in principle) compute S H numerically with hard cutoffs, but to compare to the perturbative results in dimensional regularization, one would have to convert between the cutoff scheme and MS. Through various approaches like these, it should be possible to explore the analytic structure of S H . It would be interesting to look at its properties in the Borel plane, for example, or whether a renormalon-free mass scheme naturally emerges. More generally, since S H is IR finite, it resembles more closely S in a theory with a mass gap than the IR-divergent S. Thus one might hope that when massless particles are present, the S-matrix bootstrap program might make more progress with S H than it has on S.