Diﬀractive deep inelastic scattering at NLO in the dipole picture: the q ¯ qg contribution

We calculate the contribution from the q ¯ qg state production to the diﬀractive cross sections in deep inelastic scattering at high energy. The obtained cross section is ﬁnite by itself and a part of the full next-to-leading order result for the diﬀractive structure functions. We perform the calculation in exact kinematics in the eikonal limit, and show that the previously known high- Q 2 and large M 2 X results for the structure functions can be extracted from our results in the appropriate limits. We furthermore discuss the steps required to obtain the full next-to-leading order results for the structure functions.


I. INTRODUCTION
In collisions with high scattering energy, one is measuring degrees of freedom of hadronic and nuclear states that only have a small fraction of the full momentum of the state, small-x degrees of freedom. The large amount of phase space available at high collision energies leads to an exponentially cascading emission of gluons. At some point, however, this cascade must be limited by unitarity requirements on scattering amplitudes. Thus gluon mergings eventually start to be equally important, even at transverse resolution scales where a weak coupling description is appropriate. The kinematical region where these two effects balance each other is referred to as the gluon saturation regime, and understanding it is the topic of many theoretical and experimental efforts. Experimentally the saturation regime is relevant for understanding hadronic collision processes and the formation of quark gluon plasma at RHIC and the LHC. A particularly precise and clean way to access the small-x degrees of freedom is, however, provided by high energy Deep Inelastic Scattering (DIS), both in the HERA experiments, and at the future Electron-Ion Collider (EIC) [1][2][3] and LHeC [4,5]. Theoretically, a convenient way to discuss the physics of gluon saturation is provided by the Color Glass Condensate (CGC) [6][7][8] effective field theory, where the nonlinear gluon system is described as a classical color field. For the DIS process, the CGC framework naturally leads to the dipole picture [9][10][11][12][13].
In the dipole picture one factorizes the DIS process of a virtual photon off a hadronic target into two ingredients. Firstly the perturbative part of the process is the development of the photon into a partonic state, to leading order a color neutral quark-antiquark dipole. The second ingredient is the scattering of this partonic state with the gluonic target, which in the high collision energy limit can be treated as an eikonal interaction with the classical color field. With the prospect of higher luminosities and the availability of nuclear targets in future DIS experiments, there has been a systematic push in the field to improve the perturbative accuracy of the dipole picture by going to higher orders in perturbation theory. In recent years the dipole picture has been extended to NLO accuracy for the high energy BK/JIMWLK evolution [14][15][16][17][18][19][20][21][22][23][24][25][26][27] and the inclusive DIS cross section [28][29][30][31][32][33][34][35][36][37][38][39][40].
Exclusive or diffractive DIS is expected to be even more sensitive to gluon saturation than inclusive cross sections [3,[41][42][43]. One way to understand this is to note that, due to the optical theorem, the total cross section is proportional to the elastic dipole-target amplitude, proportional to the gluon distribution in the target. Exclusive cross sections, on the other hand, are calculated as the square of the amplitude, and are thus much more sensitive to the large amplitudes, a signature of the saturation regime. Correspondingly, the recent work on inclusive scattering has been accompanied by several calculations of exclusive vector meson and diffractive dijet production at NLO in the dipole picture [44][45][46][47][48][49][50][51][52][53]. While these processes are an extremely important part of the coming experimental program, they both have some drawbacks for the purpose of understanding gluon saturation. Exclusive vector meson production requires some knowledge or modeling of the bound state physics of the meson. While there are systematical ways to do this perturbatively, e.g. by a nonrelativistic QCD approach as in Ref. [48] or by using universal Parton Distribution Amplitudes that can independently be measured in other processes [46,50,54], this unavoidably adds an additional source of uncertainty. Dijets, on the other hand, are well defined perturbative objects in a given jet algorithm, but only if the jet transverse momenta are sufficiently large. At realistic collider energies this has a tendency to push jet measurements to larger x and thus outside the saturation regime.
In this paper we will focus on a process that has gathered somewhat less attention in the work to push the dipole picture to NLO accuracy, namely inclusive diffractive DIS. Here the experimental signature is a large rapidity gap between the diffractive system (X) consisting of the photon remnants, and the target or its remnants. In the dipole picture the photon fluctuates into a variety of partonic states, which scatter off the target without exchanging color. In this sense the rapidity gap makes the process fundamentally an exclusive one, with a cross section given by the square of an amplitude. On the other hand, the measurement is inclusive in the sense that one sums over all of the different final states of the diffractive system, measuring the cross section differentially only in its total invariant mass M X . This latter inclusive aspect makes it possible to extend the perturbative description to much lower invariant masses and to lower x than for diffractive dijets, even if the parton level final states are the same. The cross sections for such processes are expressed in terms of the diffractive structure functions F D(3) 2 (β, Q 2 , x P ) and F D(3) L (β, Q 2 , x P ) or, equivalently, the diffractive virtual photon cross sections dσ γ * +A→A+n / d[PS] n , which are the quantities that we will calculate here.
At leading order in α s , the diffractive final state only consists of a quark-antiquark dipole. This already provides a good description of the general features of the experimental measurements at small M 2 X ∼ Q 2 [41]. However, a strict leading order picture fails to describe the rise of the cross section towards larger M X where, in order to make a high invariant mass partonic state, additional gluon radiation is required. The phenomenologically most successful approach has been to use the "Wüsthoff result" [55], which includes the radiation of one extra gluon into the final state [56][57][58][59][60][61] in a large Q 2 kinematical approximation. In our terminology, this tree-level gluon emission is already a part of the NLO result, being explicitly proportional to α s . In this paper we will calculate the same contributions as in the Wüsthoff result at what we call the exact kinematics in the eikonal limit. This means that the kinematics within diffractive system is treated exactly without a large Q 2 approximation, while the interaction with the target is eikonal.
Our result presented in this paper corresponds to a subset of the NLO result that is finite by itself, and suited for explicit numerical evaluation. The completion of the full NLO calculation requires the inclusion of virtual corrections with gluons that are not produced in the final state. We plan to return to these contributions in future work. Here we will merely outline steps that are needed to calculate these virtual contributions in our formalism. We have also here opted to calculate only the contributions where the gluon is emitted before the shockwave, not combining them with emissions after. This avoids issues with collinear and soft divergences in final state emissions, which eventually cancel against virtual corrections.
This paper is structured in the following way. We will start by introducing the experimental observable, the diffractive structure function, in Sec. II and the dipole picture formulation in terms of LCPT in Sec. III. Before moving to specific diagrams we will then discuss in Sec. V the general strategy to calculate phase space integrals for 2-and 3-particle final states of a fixed invariant mass M X in the context of an eikonal scattering picture where the interaction with the target happens at a fixed transverse coordinate. We will rederive the known result for the leading qq component of the wavefunction in our notations in Sec. VI. We then move to the main new result of this paper, the calculation of the qqg component of the cross section in full kinematics in Sec. VII. A more detailed exposition of intermediate stages of the calculation has been presented earlier in Ref. [62]. We check in Sec. VIII that our calculation reduces to known results in the in the kinematical limits of large Q 2 (Wüsthoff [55]) and large M X (e.g. in Ref. [63]), before concluding in Sec. IX. The results in this paper cover a finite, self-contained subset of the NLO corrections to the diffractive results, generalising earlier calculations to the full kinematics. We plan to return to the calculation of the remaining parts in a future publication, as outlined in Sec. III.

II. DIFFRACTIVE STRUCTURE FUNCTIONS
The diffractive cross section σ D e+A→M X +p in electron-nucleus (or electron-proton) DIS integrated over the squared momentum transfer t is usually expressed in terms of the diffractive structure functions F D 2 and F D L defined as The superscript (3) refers to the structure functions that depend on three variables, in this case β, Q 2 and x P discussed above. One can also consider structure functions differentially in the squared momentum transfer t, in which case one has the structure functions F D(4) 2,L (β, Q 2 , x P , t). In this work we consider both t-differential and t-integrated cross sections. For simplicity we focus here on coherent diffraction which corresponds to the events where the target does not dissociate, but our results are straightforward to generalize to dissociative events in the Good-Walker [64] picture (see e.g. Refs. [65][66][67][68]).
The diffractive structure functions are related to the total diffractive cross sections in γ * + A scattering as where T and L refer to transversely and longitudinally polarized photons. The experimentally measured [69][70][71][72] total diffractive cross sections are usually reported in terms of the diffractive reduced cross section Here the Lorentz-invariant quantities describing the kinematics are the virtuality of the photon −Q 2 and the fraction of the target longitudinal momentum x P carried by the pomeron (exchanged in the scattering process) in the frame where the target has a large longitudinal momentum, defined as The invariant mass of the diffractively produced system is denoted by M 2 X and the nucleon mass by m 2 N . The variable has, in the frame where the target momentum is large, an interpretation as the fraction of the pomeron momentum carried by the struck quark. The four vectors P, P are the target nucleon momenta before and after the scattering respectively, see Fig. 1. Finally y = (P · q)/(P · l) is the inelasticity describing the energy transfer from the lepton with initial momentum l, and q is the photon momentum.
We are working in the dipole picture, where one develops the virtual photon state in a series of partonic Fock states. Let us first consider the general case of an n-parton Fock state, for which we denote the phase space element as [PS] n , At leading order only the n = qq state contributes, and in this work we focus on including the n = qqg contribution which is actually the dominant component at high M 2 X (small β) and at high Q 2 [59]. This tree-level contribution is also a necessary ingredient for the future full calculation of the diffractive structure functions at NLO accuracy. The diffractive cross section can be written as where M 2 n is the invariant mass of the Fock state n. The cross section dσ γ * A→A+n for a production of a color-singlet state n in photon-nucleus or photon-proton scattering is expressed in terms of the scattering amplitudes (see [73], except we now normalize with a 2q + in a different place) as: Here i ∈ F.s.n means iterating over all the particles i in the Fock state n and q + n is the total plus momentum of the partons in this Fock state. The one particle phase space element reads The scattering amplitude is obtained from the matrix elements of dressed, interacting, states by leaving out a momentum conservation delta function where in the diffractive scattering the final state F.S. n is a color singlet. At high energies, the scattering amplitudes M γ→n can be calculated by considering the γ → n process, and inserting an interaction with the shockwave (target color field) in all possible ways. At high energies the transverse coordinates of the partons are fixed when they propagate in the color field of the target and as such the interaction can be straightforwardly described in terms of Wilson lines at fixed transverse coordinates as discussed in more detail in Sec. IV.

III. OUTLINE OF NLO CALCULATION
We will in this paper compute a part of the NLO correction to the diffractive structure functions that is finite by itself. However, let us first discuss the overall structure of the full NLO contribution in terms of the contributing diagrams. This discussion will make it more clear which parts of the NLO contributions are included in our result here, and what still needs to be done to calculate the rest. We are calculating, and drawing diagrams, for an exclusive  amplitude for a virtual photon-target shockwave scattering. Thus a single diagram includes both the Fock state expansion of the incoming dressed virtual photon state |γ D , and that of the outgoing dressed multiparton state D F.s.n| in the amplitude (8). In the diagrams the shockwave is represented by a blue band, with time progressing from left to right. The state furthest to the left is the asymptotic incoming state (the dressed virtual photon), which then develops into a superposition of bare parton states, corresponding to the LFWFs γ → n. The interaction with the shockwave then is given in terms of the bare states [74]. On the other side of the shockwave, furthest to the right, is the asymptotic final state D F.s.n|, which develops into a superposition of bare states, going leftward in the figure. Thus the part of the figure to the right of the shockwave corresponds to the complex conjugate of the LFWF of the final state, in particular with energy denominators calculated with respect to the final state.
At leading order, the only diagram contributing to the scattering amplitude is diagram (a) shown in Fig. 2, where the photon first splits to a qq dipole, and then subsequently the quarks scatter off the target color field with no net color charge transfer to the target. In momentum space, we denote the quark and antiquark transverse momenta as p 0 and p 1 , respectively, and in transverse coordinate space use the coordinates x 0 , x 1 . Similarly the fractions of the photon plus momentum carried by the quark and the antiquark are z i = p + i /q + with z 0 + z 1 = 1.

A. Radiative corrections
The purpose of this paper is to calculate the gluon emission part of the next-to-leading order contributions to the diffractive structure functions, which dominates at large M 2 X . In the CGC, if one integrates the transverse momentum of several final state particles without restriction, one can encounter spurious UV divergences in real higher order corrections, associated with the breakdown of the eikonal approximation. This occurs when the light-cone momentum p − scales associated with the produced system become comparable or larger than that of the target. In the case of diffractive structure functions considered here, the fixed invariant mass of the produced qqg state ensures that the eikonal approximation stays valid for the whole integration range, by constraining the p − scale of the diffractive system. For that reason, no UV divergence can arise in real NLO corrections to diffractive structure functions. This is to be compared to the case of inclusive DIS [31][32][33] where one uses the optical theorem and thus the final state is completely fixed to be the same as the initial one. In dijet production [36,37,44,45,52], on the other hand, one typically fixes the momenta of some of the final state particles and integrates over the others. This can lead to a different pattern of cancellations between diagrams. The calculation of the loop corrections is left for future work, but for completeness we list all the relevant diagrams in the following subsection III B.
In order to calculate the n = qqg contribution to the diffractive scattering amplitude we include gluon emission contributions from both the quark and the antiquark. There can be a regular gluon emission before the shockwave shown in diagrams (b) and (c) in Fig. 3. In light cone perturbation theory there is also an instantaneous γ → qqg vertex resulting in instantaneous gluon emission diagrams (d) and (e) in Fig. 4. These are the contributions that we will calculate in detail in this paper. Similarly the gluon can be emitted after the shock, see diagrams (f) and (g) in Fig. 5. There are several relations between these diagrams. Firstly, since the virtual photon as a whole is color neutral, there is a destructive interference between emissions from the quark and from the antiquark. This cancels the leading small transverse momentum (collinear) divergence and serves as a useful check of the relative sign between the contributions.
The contributions with emissions after the shockwave, pictured in Fig. 5, can be conveniently obtained from the corresponding ones in Fig. 3 where the emission happens before by taking a specific coordinate limit, in a procedure developed in Ref. [75]. Here one first separates out from the coordinate space γ * → qqg wavefunction a piece corresponding to the final gluon emission. In terms of equations this means that one writes the γ * → qqg wavefunction obtained from diagrams (b), (c), (d) and (e) in a factorized form as 1 Here ψ q0→q0g2 and ψq 1 →q1g2 are the 1 → 2 particle gluon emission wavefunctions. Equation (9) should be understood as the definition of the remaining parts ψ γ * λ →q0q1;q0→q0g2 and ψ γ * λ →q0q1;q1→q1g2 . Here the notation refers to these being the parts of the wavefunction that are associated (e.g. by the helicity structure) with the first γ * → qq splitting (γ * λ → q 0q1 ), but depend on the fact that the (anti)quark will later emit a gluon (q 0 → q 0 g 2 ,q 1 →q 1 g 2 ). Thus ψ γ * λ →q0q1;q0→q0g2 and ψ γ * λ →q0q1;q1→q1g2 depend on the coordinates of all three particles in the final state. The calculation of the gluon emission diagrams after the shock wave requires the qq component in the dressed D qqg| state.  [32,33] This, in turn, requires the (qqg → qq) † merging wavefunction, which is given by (minus) the hermitian conjugates of the corresponding emission wavefunctions ψ q0→q0g2 and ψq 1 →q1g2 . In other words, one is here factoring out from the gluon emission before the shockwave the gluon emission wavefunction that appears when the gluon is emitted after. A look at the transverse coordinate space γ → qqg wavefunction (see e.g. Eqs. (C14) and (C19) in Ref. [33] for explicit expressions) shows that indeed the structure of the regular emission wavefunctions naturally factorizes like this. Using the factorized notation (9), the procedure of Ref. [75] for obtaining amplitudes for the emission-after-theshock contributions in Fig. 5 is the following. One evaluates both the Wilson line operators and the γ * → qq parts of the wavefunctions ψ γ * λ →q0q1;q0→q0g2 , ψ γ * λ →q0q1;q1→q1g2 with the transverse coordinates of the gluon and its parent (anti)quark replaced by the coordinate of the parent before the emission, and changes the sign. Thus, for the emission from the quark, diagram (b), one replaces x 0 → x 0 and x 2 → x 0 , in both the Wilson line operator and in ψ γ * λ →q0q1;q0→q0g2 , with the coordinate defined as x 0 := z0x0+z2x2 z0+z2 . Correspondingly, for the emission from the antiquark, diagram (c), one replaces x 1 → x 1 and x 2 → x 1 , with x 1 := z1x1+z2x2 z1+z2 . It is clear that this corresponds to the Wilson line operator being evaluated at the correct coordinate for the diagrams (f) and (g). There is no "emission after the shockwave" contribution for the instantaneous diagrams (d) and (e). This is, however, built into the formalism of Ref. [75], since it turns out that the parts of ψ γ * λ →q0q1;q0→q0g2 , ψ γ * λ →q0q1;q1→q1g2 corresponding to the instantaneous diagrams vanish in the coordinate limits x 0 , x 2 → x 0 and x 1 , x 2 → x 1 , respectively.
One can arrive at this procedure for constructing the final state emissions in multiple ways. In Ref. [75] it is derived explicitly by looking at the expressions and noticing a relation between the energy denominators of the different diagrams. More generally, using the orthogonality of the |γ D and |qqg D states one can derive the corresponding relation between the wavefunctions for (qqg → qq) † , γ → qq (diagrams (f) and (g) in Fig. 5), γ → qqg (diagrams (b), (c), (d) and (e) in Figs. 3,4) and the process (qqg → γ) † , corresponding to the photon crossing the shockwave first and all the emissions happening after (i.e. the bare photon state in the final D qqg|). Since the last one does not contribute to the amplitude because the photon is color neutral, one obtains a linear relation for the contributions of the emission diagrams of Figs. 3 and 5. As discussed above, one can also see this relation directly by looking at the coordinate space wavefunctions, using the Fourier transforms from e.g. Appendix C of [33].
In an inclusive observable where one integrates over the momenta of the final state gluon and of its parent without any restrictions, it would be natural to always keep the the initial and final state gluon emissions, Figs. 3 and 5, together because they have a tendency to cancel each other in the UV region where the gluon is at the same coordinate as the emitting quark. Thus, they are often evaluated together such as in Refs. [75,76]. However, for the case of the diffractive structure function, the restriction on the diffractive system mass M X cuts out contributions of large transverse momenta. Thus it is quite natural to evaluate the contributions of the emissions before and after the shockwave separately. On the other hand, the final state gluon emissions are associated with the wavefunction renormalization constants for, and gluon exchanges between, the outgoing quarks. The relation to these contributions which are, in our language, a part of the qq part of the cross section is especially important for the kinematical region when the gluon becomes collinear to the quark, where corresponding IR divergences must cancel. Thus it would not be natural here to consider the diagrams with gluon emission after the shockwave, before taking into account all the loop corrections. In conclusion, for a final state with a fixed M X , the natural way to group diagrams together is different from some other observables. Since we are here leaving the NLO qq contribution overall to a future paper, we will also not calculate the final state emission contributions here. The exception to this is in Sec VIII A, where one works in the M X → ∞ limit neglecting the restriction on final state momenta, and thus only the inclusion of the final state emissions allows one to get a finite result.

B. Loop corrections
In addition to the radiative contributions discussed in this paper, there are also loop corrections at NLO. We will return to a them in more detail in a future paper, but let us make a few remarks here. Firstly there are the one-loop corrections to the γ * → qq wavefunction, depicted in Figs. 6 and 7. These include a quark propagator corrections before the shockwave shown in diagrams (h) and (i), and corrections to the γ * → qq vertex (including regular and instantaneous gluon or quark exchange) shown in diagrams (j), (k), (l), (m), (n), (o) and (p). These oneloop wavefunctions have already been calculated as a part of the virtual photon NLO wavefunction [32,33,[38][39][40], and the results for the loop calculations can be directly taken from these references.
In diffractive scattering there are also additional diagrams that are not needed for total (inclusive) cross section and as such are not available in the literature. Now it will be necessary to add propagator correction diagrams where the gluon crosses the shockwave, diagrams (q) and (r). These exhibit UV divergences in the limit when the gluon coordinate becomes equal to the emitting (anti)quark. Similarly to the calculation of the inclusive cross section these will have to partially cancel UV divergences in the vertex correction diagrams in Figs. 6, 7. This cancellation is the reason why the one-loop γ * → qq wavefunction alone is not sufficient to directly achieve the full NLO result. In addition to the propagator correction type diagrams, similar normal and instantaneous vertex correction diagrams (s), (t), (u) and (v) depicted in Fig. 9 are also needed.
In our formalism (see [31,77] for more detailed discussions, based on the seminal work of [73]) we specifically exclude diagrams containing self-energy corrections inserted on the external asymptotic particles. Thus we do not explicitly have the diagrams (w) and (x) in Fig. 10 in our calculation. Instead, one must attach to the amplitude a wavefunction renormalization constant Z q/q (again see Eq. (11)), which includes the same physical contribution, and is determined by the unitarity of the evolution operator. The outgoing quark and antiquark wavefunction renormalization constants also have UV divergences, which should cancel the rest of the UV divergences from the vertex corrections. Finally there are also diagrams with a gluon exchange in the final state (diagrams (y), (z) and (aa)) in Fig.11. Naively, one could think that these corrections correspond to a renormalization of the outgoing qq state, but they cannot be absorbed into just a constant, since the quark and antiquark actually exchange momentum in the exchange. A proper discussion of how to define the dressed qq outgoing state and treat the interactions between the outgoing particles is a major part of the discussion of the full NLO result, which we will return to in future work.

IV. INITIAL AND FINAL FOCK STATES
We will from now on focus on the leading order qq part and the radiative qqg part of the cross section. To begin, let us define explicitly our notations and normalization for the Fock states. The Fock expansion of the incoming virtual photon state in terms of bare partonic states is where F. s. stands for "Fock states". For the outgoing partonic states the corresponding expansions are Here, we have only written out states that are needed for the full NLO cross section. In addition, marked with . . . , there are other states needed at higher orders and non-QCD Fock states, including the photon. In fact, for the radiative corrections that we calculate in this paper, only the leading order terms in Eqs. (11), (12) are needed. Strictly speaking the final state wave functions are not exactly hermitian conjugates of initial state ones, but differ by the sign of the iε in the energy denominators (see Eqs. (4.2) and (4.3) of Ref. [73]). This difference does not play a role in the calculation of this paper, but will be crucial in the calculation of the full NLO corrections to diffractive DIS, that we leave for a future publication. The notation Σ denotes the sum over the quantum numbers of each parton in the Fock state and a mixed space phase-space integration for each parton [31,62]. Here the prime in the sum denotes the fact that the original state is not included in the sum [73] 2 . For brevity the transverse coordinates and flavor (f ) and helicity (h) indices are not written down explicitly here. We will discuss the color structure of the final state explicitly below. The γ * λ -state renormalization coefficient is Z γ * λ = 1 + O(e 2 ) and so it can be dropped in this work. The renormalization coefficients Z q,q,g of the partonic states are of the order of 1 + O(g 2 ), and as such do not affect the tree-level NLO corrections that we discuss in this paper.
In the mixed space x i are the transverse coordinates and k + i the longitudinal momenta of the partons, and indices i = 0, 1 refer to the quark and the antiquark, and i = 2 to the gluon. The quark, antiquark and gluon creation and annihilation operators satisfy the (anti-)commutation relations Here a i and λ i refer to the gluon color and polarization, respectively.
2 What exactly counts as the "same state" requires a more detailed discussion in the case of a two-particle state than for one particle; we plan to return to this in a future paper in the context of the full NLO calculation where this term is needed. In practice, only the diagrams including self-energy corrections on asymptotic external legs need to be excluded, since they are already taken into account thanks to the overall renormalization constants.
In this work we consider diffractive scattering, which requires that the final state partons must be in a color singlet state (see also Ref. [83]). For both qq and qqg systems there exists exactly one such a configuration. These states are Here β 0 , β 1 are the quark and antiquark colors and b is the gluon color, and √ N c and √ C F N c are normalization factors.
Using the virtual photon Fock states it now becomes possible to express the matrix elements written in Eq. (8) in terms of the eikonal scattering operatorsŜ E describing a color rotation of a quark or a gluon in the target color field. For the n = qq Fock state we have where the superscript LO refers to the fact that we do not include loop corrections to the γ * → qq splitting in this work. Similarly for n = qqg we can write The eikonal scattering operatorŜ E acts on the bare quark, antiquark and gluon creation operators as: Here α (a) is the quark (gluon) color before the shock and β (b) after, and U F (A) (x) refer to the Wilson lines at transverse coordinate x in the fundamental (adjoint) representation, describing a color rotation of the quark (gluon) state when it propagates eikonally through the shockwave. Using Eq. (6) the leading order diffractive cross section can be written in terms of the scattering amplitude M LO where the summation is over the quantum numbers of the produced qq state (for which there is exactly one color singlet color configuration as discussed above). Similarly the cross section for diffractive qqg production can be written as dσ D, NLO γ * λ →qqg singlet := (2q + )2πδ(p + 0 + p + 1 + p + 2 − q + ) dp 0 dp 1 dp 2 h0,f0,h1,f1,λ2 where we again sum over the final state quantum numbers with only one possible color configuration.
The Wilson line structure in the scattering amplitude (20) corresponding to diffractive qq production now reads where x 0 and x 1 are the quark and antiquark transverse coordinates, respectively. In this work we consider coherent diffraction in which case the target nucleus does not dissociate and the average over the target color sources is taken at the amplitude level [64] (see also e.g. Refs. [65][66][67][68] for a discussion of the averaging procedure). This target average gives where O denotes the average over the target configurations and we have defined From the complex conjugate amplitude one obtains exactly the same structure but with the Wilson lines evaluated at different transverse coordinates x i . The Wilson line structure at the cross section level then reads The qqg production case, Eq. (21), can be considered similarly. The Wilson line structure in the amplitude (21) is Note that the factor t a α0α1 originates from the γ → qqg wavefunction (17). Again this needs to be averaged over the target configurations at the amplitude level. Defining the Wilson line structure at the cross section level can be written as Here the identity was used to express the adjoint Wilson line in terms of the fundamental representation Wilson lines. In Eq. (32) the mean field limit O 1 O 2 = O 1 O 2 was used to express the expectation value of the Wilson lines in terms of the dipole correlators S ij . In particular we note that no higher multipole functions (traces of n > 2 Wilson lines) appear even at finite N c in the mean field limit when we evaluate the diffractive cross section, in contrast to the inclusive two or three jet production case [84]. However at finite N c the BK [14,16] or JIMWLK [85][86][87][88][89][90][91][92] evolution would introduce an implicit dependence on such correlators. The dipole scattering amplitude 1 − S 01 satisfies the small-x BK or JIMWLK evolution equation. The necessary non-perturbative input for this evolution (initial condition at moderate x) can be determined by performing a fit to the HERA inclusive structure function data [93][94][95][96] as e.g. in Refs. [34,[97][98][99][100]. Instead of the BK/JIMWLK evolution one can also use phenomenological parametrizations such as the IPsat [101] model where again the model parameters can similarly be constrained by HERA data [102,103].
Although superficially different, our formulation here is equivalent to the "outgoing state" formulation used e.g. in Refs. [75,104]. The generic amplitude (8) is given by a matrix element between two dressed states M ∼ D out|Ŝ − 1|in D . The outgoing state approach consists in first expressing the incoming dressed state |in D in terms of the bare Fock states just like we do (formally expressed as a time evolution operator acting on a bare asymptotic state). One then passes through the shockwave, and obtains the state (Ŝ − 1)|in D also in terms of the bare states. In the outgoing state formulation one then, instead of taking a matrix element with the dressed state D out|, first inverts its Fock state expansion, expressing the bare states at the shockwave in terms of the dressed states. Then inserting this inverse Fock state expression into the expression (Ŝ − 1)|in D , one obtains the outgoing state (Ŝ − 1)|in D in terms of the dressed asymptotic (future) states. From here one can either read off the amplitudes, or first square them and think of the projection operator |out D D out| as a particle number operator counting what are in Refs. [75,104] called bare particles at x + = ∞, which we would here call dressed particles. In other words, in the outgoing state formalism one is acting with the time evolution operator from x + = 0 to x + → ∞ on the state (Ŝ − 1)|in D to express it in terms of the states |out D , whereas here we use the inverse time evolution operator from x + → ∞ to x + = 0 to get D out| in terms of the bare states at x + = 0. The connection is easiest to see in terms of the diagrams for the amplitude, which are the same in both approaches and lead to the same expressions 3 .

V. FINAL STATE PHASE SPACE
The diffractive structure functions are measured at fixed invariant mass M 2 X . On the other hand, the diffractive 2-or 3-parton production cross sections in Eqs. (25) and (26) are written in terms of the three-momenta of the quarks and gluons. These momenta are, in turn, obtained by Fourier-transformation from coordinate space, which is how we understand the interaction with the target color field. The only place where the momenta of the final state particles appear is in the exponentials of this Fourier-transform and the delta function setting the invariant mass to M 2 X . In order to get the final diffractive structure function one needs to integrate over the final state momenta with the restriction on M X . It can be convenient to do this before integrating over the coordinates of the particles. This results in generic "transfer functions" from a coordinate space squared amplitude to the final states with mass M X . These functions, one for the two-and another one for the three-particle final states, are the same for all states with the same number of partons. Thus, it makes sense to calculate them separately. Here, we will consider the two-and three-particle phase space integrals separately in subsections V A and V B.

A. 2-particle phase space
In terms of the reduced wavefunction ψ γ * λ →q0q1 := ψ γ * λ →q0q1 (x 0 , x 1 , z 0 , z 1 ) defined in Eq. (16) the diffractive qq production cross section (25) reads where we have written S ij = S(x i , x j ) and x ij = x i − x j . The virtual photon polarization is denoted by λ, and the transverse coordinates in the complex conjugate amplitude are x i . The overall color factor N c is obtained when performing the color algebra in the final state requiring that the outgoing state is a color singlet, see Eq. (30). In order to obtain the diffractive cross section at fixed invariant mass M X and squared momentum transfer t, we need to integrate over the three-momenta p 0 and p 1 and introduce delta functions that enforce the required kinematics. Towards this goal we define the following transverse momentum variables which should be understood as the total momentum transfer from the target to the diffractive system, and the relative 3 Note however that Refs. [75,104] use a different normalization for single particle states and for phase space integrals.
momentum of the quark-antiquark pair 4 . These satisfy: and neatly the Jacobian is unity, i.e. in (25) we may replace d 3 p 0 d 3 p 1 → (q + ) 2 d 2 ∆ d 2 l dz 0 dz 1 with z i := p + i /q + . With this change of variables, and imposing the M X and t constraints, the diffractive cross section (25) where we defined and As the reduced photon wavefunction ψ γ * λ →q0q1 can only depend on the coordinate separation r := x 0 − x 1 , it is useful to make a change of variables from x 0 , x 1 to the dipole size r and the impact parameter b = (x 0 + x 1 )/2 (using again coordinates with a bar for the complex conjugate amplitude). In these coordinates we get In the most general case (i.e. without further assumptions), it is also possible to perform the l integral which gives This transfer function is related to the probability to form a final state with the given invariant mass M X given the dipole sizes r and r in the amplitude and conjugate amplitude with fixed longitudinal momentum fractions z i for the quarks. The integral over ∆ is of the same form, and gives Eventually we are also interested in t-integrated diffractive cross sections and structure functions. Integrating over the squared momentum transfer t we find The most general result for the total diffractive cross section in the case where the final state consist of two particles is then given by Eq. (40) with the phase space integrals I M X given above. In particular, we emphasize that the cross section (40) can not be in general written in a factorized form commonly used in the literature [58,59,105] where the result is expressed as a square of an integral over the transverse coordinates in the amplitude. In Sec. VI B we discuss in detail the approximations necessary to obtain a form for the diffractive cross sections where the dependence on impact parameter, amplitude coordinates and conjugate amplitude coordinates factorize and enables one to write the result in the "squared integral" form.
We finally note that the "off-forward" phase in the amplitude coupling the dipole size and the momentum transfer is exp i 2z0−1 2 ∆ · r as shown in Eq. (43), and not exp (i(1 − z 0 )r · ∆) as has been commonly used in the literature based on Refs. [106,107]. The correct phase factor has been discussed e.g. in Refs. [108,109]. We furthermore note that if one used the center-of-mass b = z 0 x 0 + z 1 x 1 as an impact parameter (which would be a natural variable in light cone perturbation theory), no such an off forward phase would appear and the phase factor in Eq. (43) would be just exp i∆ · (b − b ) .

B. 3-particle phase space
Let us next consider the phase space integral for the case where there are three particles in the final state, referring to the qqg system in this work. Analogously to the 2-particle case discussed above, the starting point is the total diffractive cross section (see Eq. (26)) at fixed M 2 X and t: where the color factor tr(t a t a ) = N c C F is again obtained as shown in Eq. (33) in Sec. IV, p 0 , p 1 , p 2 are the fourmomenta of the produced partons, and their transverse coordinates are again labeled as x 0 , x 1 , x 2 in the amplitude, andx 0 ,x 1 ,x 2 in the conjugate amplitude. The plus momentum fractions are again denoted by z i , and the transverse integral normalization is defined as x := d 2 x 2π . Note that this introduces an explicit (2π) 6 in Eq. (48), but will lead to nicer expressions in the end.
Next we define the following transverse momenta Here P i could be interpreted as the momentum of the particle i with respect to the center of mass of the 3-particle system before the scattering (i.e. the momentum q). The momentum ∆ is then the total momentum transfer from the target to the scattering system, and K the relative momentum of the gluon with respect to the quark. The remaining P then is proportional to the relative momentum of the antiquark with respect to the quark-gluon system, which becomes more obvious if one writes it as P = z 0 z 1 [(P 0 + P 2 )/(1 − z 1 ) − P 1 /z 1 ]. Using the above variables allows us to rewrite the invariant mass of the final state particles in a simple way as the sum of two squared momenta: Now we need to apply the same changes to the exponential phases in the integral (48): To obtain the cross section differentially in invariant mass and squared momentum transfer, we again integrate over all three-momenta and include delta functions that impose the required constraints. This gives where we have again separated the transverse momentum integrals: The integral in I ∆ can be evaluated using standard methods in spherical coordinates, yielding Eventually we want to calculate t integrated diffractive structure functions. The integration over the squared momentum transfer t gives: where b := z 0 x 0 + z 1 x 1 + z 2 x 2 is the center-of-mass of the qqg-system, andb is the respective coordinate in the conjugate amplitude 5 . The calculation of I M X is more involved, and proceeds by Fourier transforming the δ-function: To simplify the notation, we define the following transverse coordinates: With these and Eq. (60) -and completing some squares -we can write Computing the -now Gaussian -transverse momentum integrals requires shifting η → η + i in the complex plane. With this we have ∞ 0 dze iηz = i/(η + i ), and so This allows us to write the remaining η-integral as a residue in the lower half of the complex plane: where we can now take → 0. The singularity at η = 0 is an essential singularity, which means that we can read the above residue as the coefficient of the 1 η -term in the series expansion of the residue function. Defining and so the residue is found at 2 + m − n = 1 =⇒ n = m + 1. Thus we finally have where Y 012 := Y 012 , and the series expansion of the Bessel function of the first kind was recognized: In terms of the quark, antiquark and gluon coordinates, the transverse distance scale appearing as a conjugate to the invariant mass reads Note that Y 2 012 does not depend on the center-of-mass of the qqg system b = z 0 x 0 + z 1 x 1 + z 2 x 2 (or onb) which is the Fourier conjugate to the momentum transfer ∆. Consequently if the impact parameter dependence factorizes from the Wilson lines as S 012 − 1 = T (b)(S 012 − 1), then the t-integrated diffractive cross section is proportional to The transverse momentum integrals I ∆ combined with the virtual photon wavefunctions and qqg-target scattering amplitudes can now be directly used to calculate the total diffractive cross section at fixed invariant mass M 2 X using Eq. (55).

VI. LEADING ORDER DIFFRACTIVE CROSS SECTION
In this section we present for completeness a derivation for the qq contribution to the leading order diffractive cross section. The calculation is organized as follows. First in Sec. VI A we review the leading order photon wavefunction describing the γ → qq dipole dipole transition, and show the squared wavefunction needed in the case of DDIS. In Sec. VI B we derive the general leading order result for the diffractive cross sections, after which we discuss in detail what approximations are necessary in order to derive the form commonly used in the literature.

A. Coordinate space wavefunction
The wavefunction describing the tree-level γ * → qq splitting is required to evaluate the leading order cross section Eq. (40). In D = 4 dimensions the virtual photon wavefunctions in transverse coordinate space read [78] (see also Refs. [31,32]) for the longitudinal photon, and for a transversely polarized photon with polarization λ, with x 01 := x 01 andQ := √ z 0 z 1 Q. The quark fractional charge is denoted by e f . Note that our convention to pull out a normalization factor (2q + ) from the definition of the reduced wavefunction in Eq. (16) allows us to write the wavefunctions in terms of the longitudinal momentum fractions z i with no explicit dependence on q + . In order to calculate the diffractive cross sections we need the squared wavefunctions in the case where the quark transverse coordinates are different in the amplitude and in the conjugate amplitude. Summing over the quark helicities these squares read h0,h1 and A more familiar form of the momentum fraction dependence can be obtained noting that (z 0 − z 1 ) 2 + 1 ≡ 2(z 2 0 + z 2 1 ), which holds under the plus-momentum conservation z 0 + z 1 = 1.

B. From wavefunction to diffractive cross-section
The total diffractive cross section at leading order can be obtained by substituting the virtual photon wavefunction in Eq. (40), and using the phase space integrals I (2) ∆ and I (2) M X given in Eqs. (43) and (45). Let us first consider the case where the virtual photon is longitudinally polarized. We note that although the squared wavefunction (72) factorizes as K 0 (x 01Q )K 0 (x 0 1Q ), the diffractive cross section can not be written simply as a square of an amplitude, since the phase space integral I (2) M X mixes the transverse coordinates in the amplitude and in the conjugate amplitude even after an integral over the total momentum transfer, see Eq. (45).
In order to derive the leading order results for the qq contribution to the diffractive structure functions commonly used in the literature [58], further approximations are required. In particular, we assume that • The invariant mass M 2 X or virtuality Q 2 is so large that exp −i 2z−1 2 ∆ · (r −r) ≈ 1 (note that r 2 ,r 2 1/Q 2 , 1/M 2 X ). In this case, the momentum transfer integral of I (2) ∆ gives only δ (2) b −b , see Eq. (47). This is also the case if the dipole-proton scattering amplitude depends on the center-of-mass of the qq system b = z 0 x 0 + z 1 x 1 and not on the impact parameter b = (x 0 + x 1 )/2 (see discussion in Sec. V A).
• The dipole-target interaction does not depend on the orientation of the dipole or that of the impact parameter, i.e. S 01 ≡ S( x 0 − x 1 , b ) =: S rb . The angular dependence is commonly neglected when the initial condition for the BK evolution is determined by fitting the HERA data [34,[97][98][99][100] and in parametrizations such as IPsat [101]. However, in general such a (probably weak) angular dependence should exist, see e.g. [109][110][111][112][113].
Under these assumptions, the only dependence on the angle θ r,r between r := x 01 and r := x 0 1 is in the phase space integral I and as such it factorizes into parts that only depend on transverse coordinates in the amplitude or in the conjugate amplitude. Integrating over t results in a delta function forcing b =b (see Eq. (47)), and the cross section becomes dσ D L,qq where we have substituted the longitudinal photon wavefunction summed over helicities shown in Eq. (72). Note that if the impact parameter dependence factorizes from the dipole scattering amplitude, i.e. (S rb − 1) ≡ T (b)[S r − 1], then the impact parameter integral completely factorizes and gives d 2 b|T (b)| 2 . On the other hand, for the transversely polarized photons, the leading order reduced wavefunction squared depends on the angle between r andr as shown in Eq. (73): which means that the part that depends on the dipole sizes r andr -omitting the dipole amplitudes for nowreads: Parametrizing the angles as ∠(l, r) =: θ and ∠(l, r) =:θ, we have for the dot product: r·r = r r(cos θ cosθ +sin θ sinθ), where the sine term vanishes in the integration. Thus we are left with Consequently, we find that (only) under the assumptions listed at the beginning of this subsection the diffractive cross section can be written in a factorized form independently of the photon polarization. Otherwise the transverse coordinates in the amplitude and conjugate amplitude are mixed. In the future it will be interesting to study numerically the effect of these assumptions, that were used e.g. in Ref. [59] where a good description of the HERA diffractive structure function data was obtained. Both cross sections can now under these assumptions be expressed in terms of an auxiliary function, denoting now z 0 = z with the integral over z 1 is performed using the δ function Using this, we may write the diffractive structure functions as which is in agreement with Ref. [59], once one accounts for the different normalization of the dipole amplitude and the different integration domain of z.

VII. TREE-LEVEL qqg-CONTRIBUTION TO THE DIFFRACTIVE STRUCTURE FUNCTIONS
In this section we present the main result of this paper: the tree-level calculation of diffractive qqg production as a function of M 2 X and t. We consider the case where the gluon is emitted before the shockwave and the qqg system then interacts with the target, corresponding to the diagrams (b), (c), (d) and (e). The emission-after-shock contribution could then in principle be obtained by taking the appropriate coordinate limits following the method developed in Refs. [75,104]. As discussed in Sec. III, we will however leave it to a future publication. For simplicity we only consider the massless quark limit in this work.
The diffractive qqg production cross section was written in Sec. V B (see Eq. (55)) in terms of the phase space integrals I × h0,h1,λ2 The only part missing from the diffractive qqg production cross section is thus the calculation of the square of the tree-level wavefunction ψ γ * λ →q0q1g2 , with different transverse coordinates in the amplitude (x i ) and in the conjugate amplitude (x i ), summed over helicities. The plus momentum fractions z i are external kinematical variables and therefore the same in the direct and complex conjugate amplitude. We calculate this square using the wavefunctions for the longitudinally and transversely polarized photons in 4 dimensions from Ref. [32] (see also Refs. [30,[38][39][40]), with the modification that the factor of 2q + has been taken out of the reduced wavefunctions ψ in the definition (17). The reduced LFWF for the longitudinal photon reads ψ Tree whereas for the transverse photon we have: where X 012 , x 0+2;1 and x 0;1+2 are defined as: The quantity Q 2 X 2 012 corresponds to the qqg formation time divided by the lifetime of the virtual photon that forms the qqg system, as discussed in more detail in Ref. [30]. Configurations with large Q 2 X 2 012 are exponentially suppressed, which enforces the restriction that the qqg state must develop within a formation time that is less than the lifetime of the virtual photon.
The calculation of the squared wavefunctions h0,h1,λ2 ψ γ * λ →q0q1g2 † ψ γ * λ →q0q1g2 is cumbersome but straightforward. More technical details are given in Ref. [62]. After a lot of algebra, we obtain the diffractive structure functions for the longitudinal structure function, and reg.
for the transverse structure function. The Υ-terms of the squared virtual photon light-front wavefunction are: It might be elucidating to note that the above expressions satisfy the following symmetries in particle exchanges: reg.
making their sum symmetric under the exchange of the quark and antiquark. Meanwhile the (b) × (c)-interference term is already a sum of terms that can be obtained by this exchange, and is thus symmetric by itself. The dipole amplitude N ij = 1−S ij satisfies the BK or JIMWLK high-energy evolution equation, where the evolution is parametrized in terms of the evolution rapidity defined as fraction of the probe photon plus momentum carried by the gluon. Following Ref. [34] where the initial condition for the BK evolution has been determined at NLO accuracy, this evolution rapidity can be taken as where W is the center-of-mass energy for the photon-nucleon scattering and Q 2 0 is some typical hadronic transverse momentum scale of the target, and taken as Q 2 0 = 1 GeV 2 in Ref. [34]. These contributions to the diffractive structure functions are finite without requiring any additional cancellations with other diagrams. This is because the invariant mass of the final state is fixed, and thus ultraviolet divergences do not appear. This also means that the divergences in the loop contributions that we are not calculating here should cancel against each other, see discussion in Sec. III. Note that without the M 2 X restriction the integration over the final state momenta would set x i → x i and an ultraviolet divergence ∼ d 2 x 02 /x 2 02 or ∼ d 2 x 12 /x 2 12 would appear. In the diffractive structure functions the corresponding structure is ∼ d 2 x i2 x i2 /x 2 i2 (with i = 0, 1) which is UV finite. The integration over the momentum transfer sets the center-of-mass of the qqg system b to be the same in the amplitude and in the conjugate amplitude, b = b, but this does not affect the behavior of these integrals in the ultraviolet region (note that I A potential (transverse) infrared divergence is removed, for a gluon emitted before the shockwave, by the dipole amplitude part vanishing for |x 02 | ∼ |x 12 | → ∞. For the emissions after the shockwave that we are not calculating here, these configurations are not suppressed by the Wilson line correlator, and need to cancel against the wavefunction renormalization of the outgoing quarks. Similarly there is no soft gluon divergence in the limit z 2 → 0, as the invariant mass, which we keep finite, gives a lower bound for the integral z 2 1/M 2 X . For a parametrically large M X our result would give a large logarithm ∼ ln M 2 X from the lower limit of the z 2 integration. While such contributions could be resummed [114], they are not easily accessible at EIC energies and we will not consider this resummation further here. The BK/JIMWLK evolution of the target, on the other hand, is associated with the z 2 → 0 limit of contributions where the gluon crosses the shockwave, but is reabsorbed and not measured in the final state, which we are leaving to future work.
The interpretation of the cumbersome Υ-terms is actually straightforward. First, Υ describes the contribution where the gluon is emitted by the quark in the amplitude and absorbed by the same quark in the conjugate amplitude. Similarly, Υ corresponds to the case where the antiquark emits and absorbs the gluon. Furthermore, the instantaneous gluon emission and absorption by the quark (antiquark) is described by Υ inst. , with the former including the interference contributions containing the (d)-diagram, and similarly the latter those of the (e)-diagram.
We emphasize that prior to this work the qqg contribution to the diffractive cross section has only been known in approximate kinematics and for a transverse photon only (which dominates at high Q 2 ), see discussion in Sec. VIII. For the longitudinal polarization even approximative results have been missing from the literature. The cross sections (88) and (89), that are main results of this work, are finite and can be straightforwardly implemented in phenomenological applications. In a future work we plan to apply these results to describe the HERA diffractive structure function data.

A. The large-MX limit
As a verification of our calculation, here we will extract the qqg-contribution for F D T in the limit of large M X , i.e. in the limit when the emitted gluon is soft. This limit has been considered by several authors, e.g. in Refs. [63,[114][115][116][117][118].
To be specific, we will use the result as it is written in Ref. [63] and also used in the phenomenological studies [58,59]. The derivation simplifies considerably in the β → 0 limit, i.e. for final states with arbitrarily large invariant masses M X in comparison to the virtuality of the photon: M 2 X Q 2 . The result of Ref. [63] for the qqg-contributionincluding gluon emissions from the quark-antiquark dipole both before and after the scattering off the shockwave 6 -is, converted to our notations: A key feature of this result is that the qqg LFWF has been factorized to the leading order qq LFWF, the BK kernel describing the gluon emission, and the scattering of the tripole state in terms of scatterings of daughter dipoles.
Our starting point to rederive the large-M X result is Eq. (48), with the final state momentum integrals still undone. First we note that the leading process to create high invariant mass final states is caused by the emissions of soft gluons, with z 2 1, and at the limit z 2 → 0 we have M 2 X ≈ P 2 2 /z 2 . We consider a t integrated cross section, whose momentum dependence in this limit reads where we assumed that M 2 X is dominated by the gluon light cone energy p 2 2 /z 2 . We have here integrated over z 2 , assuming that M X is large enough so that it is always possible to find a solution to the delta function constraint z 2 = p 2 2 /M 2 X with 0 < z 2 < 1. In reality this is not possible for arbitrarily large p 2 . Thus our approximation has rendered the p 2 integral unbounded, resulting in a delta function setting x 2 − x 2 . This approximation, as discussed in Sec. III and above in Sec. VII, would make the cross section UV divergent unless one also includes the diagrams with emission after the shockwave, using the procedure of subtracting from the "emission before" term the appropriate coordinate limit. Thus we will include these contributions here, unlike in the rest of this paper.
At this point we can integrate over the coordinates in the conjugate amplitude using the delta functions. Starting from the equation (48) we now get For calculating the subtraction terms correctly, we need to decompose the γ → qqg wavefunction into gluon emission and effective γ → qq parts according to Eq. (9). Here the limit of z 2 → 0 simplifies things dramatically. In the soft gluon limit the wavefunction is given by (note that in our normalization conventions for the reduced wavefunctions, see Eqs. (16) and (17), this relation is true without additional coefficients): where the effective γ → qq wavefunctions ψ γ * λ →q0q1;q0→q0g2 and ψ γ * λ →q0q1;q1→q1g2 defined by Eq. (9) have become independent of the gluon emission in the limit z 2 → 0. This simplification can be understood in several equivalent ways. In coordinate space, the argument is that for a soft emission z 2 → 0 the coordinate of the emitting particle does not change, and therefore the original γ → qq splitting is independent of the later gluon emission. For momentum space wavefunctions the reason is that in the limit z 2 → 0 the gluon light cone energy blows up and thus energy denominators for gluon emission become independent of the state emitting the gluon shockwave, see discussion in Ref. [118]. The normalization condition for the photon state can only be used to obtain such contributions in the soft gluon limit where the γ → qqg wavefunction factorizes into γ → qq and gluon emission wavefunctions, but not in general kinematics [31,32].
This leads to a factorization of the γ → qqg wavefunction into leading order γ → qq and q → qg,q →qg wavefunctions in momentum space, which is carried over into coordinate space. We recall from Sec. III A that the subtraction procedure consists in pulling out the gluon emission wavefunction and then taking the limit of the gluon coordinate being equal to its emitter in the remaining factors. In this case we can also factor out the common γ → qq part. Thus we get, restoring the coordinates for clarity We now use the coordinate limits of the "tripole" (see e.g. [32,33]) operators and the relation (see also e.g. [32,33]) We can now use the gluon emission wavefunction, which in our conventions (see Eq. (37) of Ref. [77] and recall the normalization of the reduced wavefunction from (16), (17)) reads and similarly for the antiquark. Using this relation in Eq. (100) enables the familiar calculation of the BK kernel Inserting the squared wavefunctions from Eq. (108) and (106) into (100) one then directly recovers the result (98).

B. The large-Q 2 limit
In this section we will rederive the Wüsthoff result for the qqg-contribution to F D T [55,57,105], which is an approximate result at the large-Q 2 limit for this next-to-leading order contribution. Specifically we will verify that the Wüsthoff result emerges from the NLO result calculated in exact kinematics when one takes the large-Q 2 limit. The Wüsthoff result for the qqg-contribution has been extensively used in phenomenology [57][58][59]61], and it has some special features we seek to understand in depth. It can be written in coordinate space in the appealing short form: 7 FIG. 12: Kinematics for the diffractive qqg-production diagram. Left: dipole picture frame, where the probe has a large q + which is conserved in the interaction with the target. We show the plus and transverse momenta, with Pqq the relative transverse momentum of the qq pair and Kgg of the gluon-effective gluon dipole. Right: infinite target momentum frame, where one tracks the minus component of the momentum, i.e. the target momentum fraction. The zigzag line refers to the pomeron emitted from the target, and is used to illustrate a generic diffractive interaction between the virtual photon and target. The two scattering pictures are connected by the invariants of the scattering. In the GBW limit the qq pair forms a hard system that is seen as pointlike by the target color field, thus M 2 qq and Pqq are the same before and after the shockwave. In the aligned jet limit z2 z1 z0 ≈ 1 we have M 2 qq ≈ P 2 qq /z1, and from the collinear factorization picture on the on the right: The invariant mass of the qqg system and Kgg, on the other hand, are affected by the interaction with the shockwave. In the dipole picture, before the shockwave, we have M 2 qqg ≈ M 2 qq + K 2 gg /z2. The invariant mass after the shockwave is M 2 while a momentum space version can be found e.g. from Ref. [57]. An essential feature of the large-Q 2 structure function is the manifestation of the DGLAP splitting function for g → qq splitting. This is associated with the DGLAP evolution of the parton distributions of the pomeron, and is written in terms of the target minus momentum fractions β and z (recall that we work in a frame where the photon has a large plus momentum). In the frame where the target has a large longitudinal momentum β can be interpreted as the fraction of the pomeron momentum carried by the stuck parton, and it is also related to the invariant mass of the final state as β = Q 2 /(Q 2 + M 2 X ). The fraction z is the fraction of the pomeron minus momentum transferred to the qq system. On the one hand, the presence of the splitting function in the large Q 2 limit is unavoidable in QCD. On the other hand, since we are using light cone perturbation theory to quantize the projectile photon, the light cone momentum fraction ξ = β/z does not appear spontaneously in the calculation. These kinematical variables are illustrated in Fig. 12.
The 3-particle phase space has been treated only approximately in the Wüsthoff result. The result is written in terms of the size r of an "effective adjoint dipole" formed by the gluon and the quark-antiquark pair. Thus the interaction with the target color field appears in the form of an adjoint representation dipole cross sectionσ dip . There is also an explicit log Q 2 resulting from the integral over some of the "internal" kinematics of the qq pair as we will demonstrate explicitly below.
The contribution corresponding to the DGLAP splitting function [(1 − β/z) 2 + (β/z) 2 ] originates from the part of phase space where emissions are strongly ordered in transverse momenta. Since we want to obtain the DGLAP splitting function with fixed momentum fractions in k − , this implies that the k + momentum fractions must also be strongly ordered. This is the Bjorken aligned jet [119] configuration. Thus it is more convenient to work in momentum space to make the connection to the Wüsthoff result. To make this more specific, we choose the transverse momenta for the 3-particle qqg final state (after the shockwave) as K gg := (z 0 + z 1 )P 2 − z 2 (P 0 + P 1 ) , where p i are the transverse momenta of the final state partons, and the variable P qq can be interpreted as the relative momentum between the quark and antiquark, whereas K gg is the momentum of the gluon with respect to the quark-Next we move on to manipulate the virtual photon splitting light-front wavefunction in the aligned jet limit. The γ * λ → qqg LFWF in momentum space in the convention of Refs. [31,32] is Σ ij (e) = 1 4 Additionally it will be useful to simplify the momentum dependence of the qqg LFWF by using the three particle state invariant mass: With these simplifications the normal emission part of the wavefunction (132) becomes In the kinematic regime P 2 qq K 2 gg and we can first simplify the term in the square brackets as . . . = −P i qq ((P qq + K gg ) 2 + z 1 Q 2 )K k gg + (P qq + K gg ) i (P 2 qq + z 1 Q 2 )(K k gg − z2 z1 P k qq ) ((P qq + K gg ) 2 + z 1 Q 2 )(P 2 qq + z 1 Q 2 ) ≈ −2P qq · K gg P i qq + (P 2 qq + z 1 Q 2 )K i gg (P 2 qq + z 1 Q 2 ) 2 where the last term is kept since in this strong ordering limit we have z 2 /z 1 ∼ K 2 gg /P 2 qq , whereas the associated term proportional to (z 2 /z 1 )K i gg P k qq is discarded as a higher order term. Out of the instantaneous emission terms, the contribution from diagram (e) is enhanced by a factor of 1/z 1 w.r.t. diagram (d), which means that in the aligned jet limit, we can neglect the contribution from the latter, and only the former contributes. Using the identity we see that the quark (and consequently antiquark) helicity is completely fixed by the photon polarization. This is a common feature of LCPT vertices: the particle carrying all the longitudinal momentum in a splitting inherits the light front helicity of the parent [77]. We can now combine the leading instantaneous contribution with the regular emissions: Next we move on to calculate the squared amplitude. Here it is important to recall that the momenta K gg andK gg in the DA and CCA, respectively, are separate, whereas P qq ≡P qq ≡ P qq due to Eq. (123). We need the following algebra: and −2 ξ z 1 Q 2 ( P qq · K gg ) P i qq K j gg + K i gg K j gg −ξ P i qq P j qq Now we remember that at the cross section level we are integrating over the phase space d 2 P qq with P 2 qq = z 1 (1 − ξ)Q 2 /ξ fixed, i.e. over the angle of P qq . This means that we can replace Thus we get ingredients of the large-Q 2 result: a DGLAP-type logarithm in Q 2 , a splitting function P g→qq and also the somewhat curious traceless rank-2 tensor "photon to effective gluon dipole wavefunction" (see Eq. (162)), from a dipole picture calculation. Thus we believe that the method of this calculation can be helpful in more general for matching the dipole picture with the collinear factorization limit. Experimentally, the diffractive structure function is a key part of the program in high energy DIS experiments, both at HERA and at the future EIC. Compared to e.g. diffractive dijets, it is a clean and well defined observable without requiring high transverse momentum or heavy quarks in the final state. This makes it possible to access smaller values of x Bj at a finite collision energy than for dijet observables, and thus to achieve a better sensitivity to gluon saturation. The nuclear modification of diffractive structure functions has already been identified as a key observable for gluon saturation at the EIC [1].
Our results are presented in a form that can directly be applied to phenomenology. They generalize the large-Q 2 and large-M X results used in earlier phenomenological studies to a more precise kinematics. Depending on assumptions on the impact parameter dependence of the dipole amplitude, various different simplifications are possible. However, our main result is completely general in this regard and can be applied to any impact parameter dependence. It will be interesting to evaluate the diffractive structure function numerically, both in order to compare to earlier limiting results, and to test dipole amplitude parametrizations against a new set of experimental data. On the theory side, we have in this paper also outlined the necessary steps to complete the NLO calculation of the diffractive structure function. Here there are many recent results that can be taken advantage of. The loop corrections to the γ → qqg light cone wavefunction are known [31][32][33]. So is the procedure to factorize the large logarithms of x P into the BK/JIMWLK evolution of the target [62,122], once the corresponding diagrams (with a gluon crossing the shockwave but not the cut) are calculated. The treatment of final state gluon exchanges poses interesting conceptual questions that are new in the context of LCPT. While in many ways straightforward, these further calculations are sizeable enough projects that they are best left for future publications.